ISO/IEC NWI xxx
Information Technology
Procedures for Achieving Content Consistency in ISO/IEC 11179 Metadata Registries
Working Paper
Draft 3.0
September 1999
Procedures for Achieving
Data Registry Content Consistency
Contents
-------------------------------------------------------------
Foreword
Introduction
1 Scope
2 References
3 Definitions
4 Component framework
5.0 REGISTER A DATA ELEMENT
5.1 General Procedures
5.1.1 Understanding the Data Element
5.1.2 Content Research
5.1.3 Definition and Permissible Values
5.1.4 Name and Identifiers
5.1.5 Other Metadata Attributes
5.1.6 Data Element Concept
5.1.7 Classification Attributes
5.1.8 Quality Control
5.2 International Standard with Enumerated Domain
5.2.1 Understanding the Data Element
5.2.2 Content Research
5.2.3 Definition and Permissible Values
5.2.4 Identify and Name the Data Element
5.2.5 Other Metadata Attributes
5.2.6 Data Element Concept
5.2.7 Classification
5.2.8 Quality Control
5.2.9 Other Codes and Names from ISO 3166
5.2.10 Summary of Attributes
5.3 International Standard with Non-Enumerated Domain
5.3.1 Understanding the Data Element
5.3.2 Content Research
5.3.3 Definition and Permissible Values
5.3.4 Identifying and Naming the Data Element
5.3.5 Other Metadata Attributes
5.3.6 Data Element Concept
5.3.7 Classification
5.3.8 Quality Control
5.3.9 Other Data Elements in ISO 6709
5.3.10 Summary of Metadata Attributes
5.4 Application Data Element
5.4.1 Understanding the Data Element
5.4.2 Content Research
5.4.3 Definition and Permissible Values
5.4.4 Identify and Name the Data Element
5.4.5 Other Metadata Attributes
5.4.6 Data Element Concept
5.4.7 Classification
5.4.8 Quality Control
5.4.9 Related Data Elements
5.4.10 Summary of Metadata Attributes
5.5 Register a Group of Data Elements
5.5.1 Information System Entity Group
5.5.2 Composite Data Element
5.5.3 Use Group
5.6 Linking of Data Elements
5.7 Registration of Associated Sources/Documents
6 Complex data
Annexes
A Bibliography
B Definitions of representation class terms
C Principles of managing shared data
D Data registry uses and users
E Conceptual and logical data models
F Table of Data Elements Attributes for Examples
G Top Down Approach to Data Element Registration
G.1 Biological Organisms
G.1.1 Data Element Concepts
G.1.2 Data Elements
G.1.3 Permissible Values
G.2 Biological Organism Types
G.2.1 Data Element Concepts
G.2.2 Data Elements
G.2.3 Permissible Values
G.3 Top Down Registration
Y Business Rules for Populating a Metadata Registry
Y.1 Data Element Definition
Y.1.1 Mandatory Rules
Y.1.1.1 Uniqueness
Y.1.1.2 Singular
Y.1.1.3 State the Concept; Not Only its Negative
Y.1.1.4 Descriptive Phrase or Sentence
Y.1.1.5 Contain Only Commonly Used Abbreviations
Y.1.1.6 No Embedded Definitions
Y.1.2 Guidelines for Definitions
Y.1.2.1 Essential Meaning of Concept
Y.1.2.2 Precise and Unambiguous
Y.1.2.3 Concise
Y.1.2.4 Stand Alone
Y.1.2.5 No Embedded Information
Y.1.2.6 Avoid Circular Reasoning
Y.1.2.7 Consistency for Related Definitions
Y.1.3 Data Element Definition Syntax
Y.1.4 Terms Commonly Used in Definitions
Y.2 Representational Attributes
Y.2.1 Permissible Values
Y.2.2 Value Domain
Y.2.3 Representational Terms
Y.2.4 Example
Y.3 Identifying and Naming a Data Element
Y.3.1 Name Context
Y.3.2 Establish a Naming Convention
Y.3.3 Example of a Naming Convention
Y.3.4 Formulating a Data Element Name
Y.4 Identification
Y.4.1 Data Element Identifier and Identifier
Y.4.2 Versioning
Y.5 Conceptual Relationships
Y.5.1 Data Element Concept
Y.5.2 Conceptual Domain
Y.5.3 Value Meanings
Y.6 Classification
Y.7 Quality Review
Y.7.1 Registration Status
Y.7.2 Administrative Status
Y.8 Reference Documents
Foreword
ISO
(the International Organization for Standardization) and the IEC (the International
Electrotechnical Commission) form the specialized system for worldwide
standardization. National bodies that
are members of ISO or IEC participate in the development of International
Standards through technical committees established by the respective
organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate
in fields of mutual interest. Other
international organizations, governmental or non-governmental, in liaison with
ISO and IEC, also take part in the work.
This
document was prepared by ISO/IEC JTC 1/SC 32, Data Management and Interchange.
Introduction
The
exchange of metadata between ISO/IEC 11179 metadata registries depends not only
on registry software that conforms to the standard, but also on metadata
contents that are compatible between registries. While the standard has
provisions for data element specification and registration, there are pragmatic
issues pertaining to populating the registries with content. Based on the experiences of organizations
that are implementing the standard, a technical report to explore content
issues will help current and future users.
Well-formed
data elements and their domains can be recorded in a metadata registry as
"models" for potential reuse. Additional attributes may be required
to record essential facts about how a data element is used in an application,
e.g., for data quality, collection method, collection purpose, etc.
The
proposed revision of ISO/IEC 11179, Part 3, models a data element (DE) and its
associated components. A data element
consists of the data element concept plus its representation. Some questions raised in the process of
implementing registries concern this structure. Creation of an application data element frequently requires
additional qualification of the object class and/or property. Does this creation of an application element
always cause the creation of an application data element concept? Does the qualified concept inherit meaning
from the standard concept to which it is related, and is there an adequate
place in the current scheme to store this relationship? How are application DECs distinguished from
other DECs, or is there a need to make such a distinction? These are examples of topics that might be explored
in a document addressing content consistency among registry implementations.
Conceptualization
and articulation of rules and relationships in the creation of object classes,
properties, data element concepts and data elements are needed. Explication of the various possible levels
of data elements and data element concepts and their relationships would
greatly assist in the creation of shareable, well-formed data. Relationships and inheritance from the most
abstract data element to the most concrete application data element need to be
specified. Reuse of data value domains
should be enabled and regularized.
1 Scope
1.1 Background
A registry is a tool for the management of shareable
data; a comprehensive, authoritative source of reference information about
data. It does not contain data itself, but it provides information on the
definition, origin, source, and location of data. It supports the standard‑setting
process by recording and disseminating data standards, which facilitates data
sharing among organizations and users.
It provides links to documents that refer to data elements and to
information systems where data elements are used. When used in conjunction with an information database, the
registry enables users to better understand the information obtained.
This
Technical Report is based on the American National Standards Institute (ANSI)
X3.285:1999 Standard, Metamodel for the
Management of Shareable Data. The
standard specifies the structure of a data registry in the form of a conceptual
model. The conceptual model is more
abstract than a logical data model in that it does not prescribe any particular
representation of the data. It
is not intended to be a logical data model for a computer system, much less a
physical model.
A
data registry contains the metadata that is necessary to clearly describe,
inventory, analyze, and classify data.
It provides an understanding of the meaning, representation, and
identification of a unit of data. The
ANSI X3.285 standard "outlines the information elements associated with a
data element concept that need to be available for determining the meaning of a
data element to be shared between systems.
The standard is a complement to the six-part International Organization
for Standardization/International Electrotechnical Commission (ISO/IEC) 11179
standard that describes the organization of a data registry for managing the
semantics of data elements in data systems."[1]
1.2 Purpose
The
purpose of this Technical Report is to describe business rules for the
registration of data elements and their attributes in a registry. This document is not a user’s guide for data
entry, but a guide for conceptualizing a data element and its components for
the purpose of consistently establishing good quality data elements.
1.3 Scope
The scope of this document
is limited to the essential components of a data element: the data element
identifier, registry name, definition, and example; data concept; conceptual
domain with its value meanings; and value domain with its permissible
values. This document is not concerned
with the entry of detailed metadata for documents, standards, systems, groups,
partners, and message sets.
2 References
ISO/IEC
DIS 11179-1, Information technology - Specification
and standardization of data elements - Part 1: Framework for the specification
and standardization of data elements
ISO/IEC
DIS 11179-2, Information technology -
Specification and standardization of data elements - Part 2: Classification for
data elements
ISO/IEC
11179-3:1994, Information technology -
Specification and standardization of data elements - Part 3: Basic attributes
of data elements
ISO/IEC
11179-4:1995, Information technology -
Specification and standardization of data elements - Part 4: Rules and guidelines for the formulation of
data definitions
ISO/IEC
11179-5:1995, Information technology -
Specification and standardization of data elements - Part 5: Naming and identification principles for
data elements
ISO/IEC
DIS 11179-6, Information technology -
Specification and standardization of data elements - Part 6: Registration of
data elements
ISO/IEC
TR 15452, Information Technology -
Specification of Data Value Domains
3 Definitions
For the purposes of this document, the following
definitions apply.
3.1 attribute: A characteristic of an object or entity.
3.2 conceptual domain: A set of possible valid value meanings of a data element expressed without representation.
3.3 context: A designation or description of the application environment or discipline in which a name is applied or from which it originates.
3.4 data element: A unit of data for which the identification, meaning, representation and permissible values are specified by means of a set of attributes.
3.5 data element concept (DEC): A concept that can be represented in the form of a data element, described independently of any particular representation.
3.6 data element registry: An information resource that describes the meaning and representational form of data elements.
3.7 data element representation: A data element component consisting of a value domain and representation class.
3.8 data identifier: A language-independent unique identifier of a data element within a registration authority. An unambiguous name for an object within a given context.
3.9 data item: An occurrence of a data element value.
3.10 data value: An element of a value domain.
3.11 data value domain: A set of possible valid values of a data element expressed in a certain representation, for a data element having a value domain.
3.12 enumerated domain: A value domain that is specified by a list of all permissible values.
3.13 identifier: See data identifier.
3.14 international registration data identifier (IRDI): The unique and registered identifier of a data element.
3.15 metadata: Data that defines and describes other data.
3.16 name: The primary means of identification of objects and concepts for humans.
3.17 object class: A set of ideas, abstractions, or things in the real world that can be identified with explicit boundaries and meaning and whose properties and behavior follow the same rules.
3.18 permissible value (label): An expression of a value meaning in a specific value domain.
3.19 property: A peculiarity common to all members of an object class.
3.20 representation class: A classification of types of representations.
3.21 structure set: A method of placing objects in context, revealing relationships to other objects. Examples include Entity-Relationship Models, taxonomies, and ontologies.
3.22 value meaning: A valid value in a conceptual domain.
3.23 value meaning identifier (VMID): A label that uniquely identifies a value meaning.
4 Component framework
This clause presents a conceptual framework for structuring data elements and data element components in a registry. Data elements are ideally the result of a process of development, involving several types of abstraction, producing a series of "layers" related to each other by the method of abstraction used to produce one from the other. Layers usually progress from the most general (conceptual) to the most specific (ultimately, the physical layer, although a metadata registry would not contain these).
One could use layers to structure development of a system using the Zachman Framework, for instance, with the highest levels of definition contained in the business view, and development progressing to the implemented system level. The number and granularity of layers are driven by user requirements. This clause will describe several (non-exhaustive) possible layers, none of which are intended to be mandatory for any particular implementation.
The
members of each layer are called data element components. Components are envisioned as a set of
building blocks that can be assembled into data elements. Some components may also be members of a
registry in their own right.
4.1 Abstraction types
Abstraction
is a tool which has been well-developed by the object-oriented community. It is
used as a way of focussing on parts of the model of interest to a particular
process or function. The term
"abstraction" is used to refer both to the process and the results of
the process. Abstraction can be applied
to the registry environment as a way to articulate the development of
components and their relationships to each other.
Several
methods can be used to achieve the decomposition of layers from the most
abstract to the more concrete. Starting
with the most general conceptual notions and progressing to the data elements
in applications, these layers can be labeled by the type or types of
abstraction used to produce them from another level.
The
three types of abstraction of most interest to data element development are: decomposition/aggregation,
instantiation/classification, and
specialization/generalization.
· Decomposition/aggregation relates an item to its parts. Decomposition may be described as "x is a part of y," or the part-of relationship. The reverse, aggregation, shows that y may be composed of x among other items.
· Instantiation/classification relates an item to a class of items. This is described as the is-a relationship, "x is a(n instance of) y." Classification reverses the relationship; y contains x as well as other items.
· Specialization/generalization relates two classes, where all items in one (the subclass) are also in the other (the superclass).
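The three abstraction types can be illustrated with a minimal Python sketch. The names Component, ObjectClass, and Tree are hypothetical and are not defined by ISO/IEC 11179; they serve only to show one relationship of each type.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical registry component, used only for this illustration."""
    name: str
    parts: list = field(default_factory=list)  # aggregation: what this item is composed of


class ObjectClass:
    """Superclass standing in for any class of real-world things."""


class Tree(ObjectClass):
    """Subclass: every Tree is also an ObjectClass."""


# Decomposition/aggregation: "x is a part of y".
height = Component("height")
tree_height = Component("tree height", parts=[Component("tree"), height])
is_part = height in tree_height.parts

# Instantiation/classification: "x is a(n instance of) y".
oak = Tree()
is_instance = isinstance(oak, Tree)

# Specialization/generalization: all items in the subclass are in the superclass.
is_subclass = issubclass(Tree, ObjectClass)
```

All three flags evaluate to true, and the instance oak is also a member of the superclass because of the subclass relationship.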
4.2 Conformance
Layers
of abstraction can be used to determine conformance of a registry
implementation to a standard.
Specification of the member classes and abstraction types used to
determine the layer members can be used to define conformance. This will lead to improved chances for
interoperability among registries.
4.3 Developing Layers of Abstraction
The
process of deriving layers of abstraction for a registry can be described by a
series of examples. Some or all of
these layers may be useful for any given registry.
Abstraction
relationship types define the boundaries between layers. Rules for conformance may be derived from
both boundary abstraction and the relationships of the components of each
layer.
A
useful starting point is the set of real world things that the registry
attempts to model. These can be
described by the phrases "concepts (things, beings, ideas…),"
"things about them," "how they look," and "what they
mean." So, the first layer of
abstraction is the translation of these phrases to model entities (figure
1). Applying the abstraction process of
specialization, the result is that concepts become object classes, things about concepts become properties, how they look becomes representation, and what they mean becomes the conceptual domain. By this
transformation, the amorphous content of superclasses of things in the real
world becomes subclasses composed of entities of the model, subject to rules
governing their behavior.
Of
course, every model-based registry must include this layer. This is the basic assumption of model
building.
Within
the model, other layers of abstraction can be applied to produce model entities
of use to the developers and users. For
example, aggregation can be applied to the object class and property entities
to produce the data element concept. These can be related to conceptual domains
(which contain sets of value meanings) to produce a potentially useful entity,
the conceptual generic element
(figure 2).
Conceptual
generic elements consist of the attributes associated with their constituent
components. These serve to describe the
object class property and its value meanings without any particular
representation assigned. An example, using ISO 3166, would be to describe country identifier without specifying
which one of the seven possible representations for names or codes for countries
contained in ISO 3166 is preferred.
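As an informal illustration, the aggregation described above might be sketched as follows. The class and field names are assumptions made for this example and are not taken from the metamodel, and the two value meanings are illustrative entries only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str    # e.g. "Country"
    property_name: str   # e.g. "Identifier"


@dataclass(frozen=True)
class ConceptualDomain:
    name: str
    value_meanings: frozenset  # meanings only; no codes or names are chosen here


@dataclass(frozen=True)
class ConceptualGenericElement:
    """Aggregates a data element concept with a conceptual domain."""
    concept: DataElementConcept
    domain: ConceptualDomain


country_identifier = ConceptualGenericElement(
    concept=DataElementConcept("Country", "Identifier"),
    domain=ConceptualDomain(
        "Countries",
        frozenset({"the country of Canada", "the country of France"}),
    ),
)
```

The sketch deliberately carries no representation attribute: the choice among the ISO 3166 names and codes is left to a later layer.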
Consider
representation. It was mentioned
earlier as if it were a model entity, but it does not exist as such in the
model. Representation is a combination
of data value domain with its permissible values (if enumerated) or description
(if not); representation class; and datatype, character set, and unit of
quantity of the values in the value domain.
Therefore, it must be abstracted by aggregation if it is to be considered
as a unit.
Combining
a property with the representation components can create a useful
construct. A logical generic element such as "height measure in feet"
can be used to record conformance criteria such as allowed range values. A narrower construct, limiting the
components to property and representation class, can be created to record
generalized conformance criteria such as that "height measure" must
only be used with units of measure with values of "feet," "inches," "meters,"
"centimeters," etc. These
would potentially be combined with object classes to produce data elements such
as "tree height measure" with a conformance criterion of "0 < height < 500"
(figure 3).
Another
useful object-oriented concept can be applied to allow inheritance of attribute
values between layers. This mechanism
enables the process described in the last paragraph to be applied in
many-to-one relationships: "height measure" can be applied to
"telephone pole height measure" using the same conformance criterion
as "tree height measure."
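The inheritance mechanism can be sketched in Python as shown below. This is an illustration under stated assumptions, not part of the metamodel: the class GenericElement and its fields are hypothetical, and the range 0 to 500 is assumed to be recorded once on the generic "height measure" so that more specific elements can inherit it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenericElement:
    """Hypothetical element that may inherit range bounds from a parent."""
    name: str
    min_exclusive: Optional[float] = None
    max_exclusive: Optional[float] = None
    parent: Optional["GenericElement"] = None

    def criterion(self):
        """Return (low, high), falling back to the parent where a bound is unset."""
        low, high = self.min_exclusive, self.max_exclusive
        if self.parent is not None:
            p_low, p_high = self.parent.criterion()
            low = p_low if low is None else low
            high = p_high if high is None else high
        return low, high

    def conforms(self, value: float) -> bool:
        low, high = self.criterion()
        return (low is None or value > low) and (high is None or value < high)


# The criterion is recorded once on the generic element (assumption).
height_measure = GenericElement("height measure", min_exclusive=0, max_exclusive=500)
tree_height = GenericElement("tree height measure", parent=height_measure)
pole_height = GenericElement("telephone pole height measure", parent=height_measure)
```

Both specific elements apply the same criterion without restating it, so a change to the generic element propagates to every element that inherits from it.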
Other
combinations of components can be created at the registry designer's
discretion. Documentation by attributes
and relationships must be complete if registry content consistency is to be
maintained. Full use of generics
promotes reuse of standardized data description.
Figure 1. Abstraction from the real world to the model.

Figure 2. Abstraction of a conceptual generic element.

Figure 3. Inheritance of component values.

REFERENCES (INFORMATIVE)
· Codes for the representation of names of countries and their subdivisions - Part 1: Country codes, International Organization for Standardization (ISO), ISO 3166:1997.
· Standard representation of latitude, longitude and altitude for geographic point locations, International Organization for Standardization, ISO 6709, 1983-05-15.
· Information technology - Programming languages, their environments and system software interfaces - Language-independent datatypes, International Organization for Standardization, ISO/IEC 11404, First edition 1996-12-15.
· Information technology - Specification of data value domains, ISO/IEC TR 15452, March 1999.
5.0 REGISTER A DATA ELEMENT
Registration
of a data element in a data element registry requires that certain
characteristics of the data element are recorded to clearly describe and define
it. These characteristics are stored as
attributes of the data element. A
Registry can be used to record information about data elements ranging from
carefully crafted data standards to those found in applications. The amount and quality of metadata
information available can vary from good, complete information to poor,
incomplete information. This document
is intended to describe the population of a Registry with data elements for
which good quality, consistent metadata can be created. Part 3 of ISO/IEC 11179 specifies
attributes for recording information about a data element in a data
registry. This document gives examples
that demonstrate the population of a data registry. It includes attributes that are mandatory and fully defined by
the metamodel, as well as those where the registration authority must establish
its own profile of required attributes.
Many
metadata registry practitioners find that using a bottom-up approach to
registering a data element is most appropriate. In many cases where a data element is submitted for registration,
only limited information (e.g., a name, definition, and a set of permissible
values) is provided. Other attributes
must be determined based on an understanding of the underlying data values and
concepts that are implied by those facts.
These are most commonly registered by means of a bottom-up registration
procedure, where the basic metadata attributes about the data element (e.g.,
definition, name, and permissible values) are completed prior to defining the
conceptual information about the data element.
A bottom-up approach might also be used where the metadata registry is intended to serve as a distribution
mechanism for metadata that describes the data in data products such as public
data sets, query results, etc. The
examples provided in this report describe how to formulate attributes about a
data element, based on a bottom-up procedure.
First a general procedure for registering data elements is described,
followed by examples of registration of three types of data elements, including
data elements from:
· An international standard with an enumerated domain.
· An international standard with a non-enumerated domain.
· An information system, where the application data element uses an enumerated domain.
The
registration procedures are presented in a logical order for analyzing and
formulating attributes for a data element.
Annex F contains a table that concisely summarizes the information
registered for each data element in the examples that follow.
This
report is intended to be used to help metadata registry practitioners to
formulate the attributes that describe and define a data element. Section 5.1 presents an overall approach to
data element registration. Sections
5.2, 5.3, and 5.4 should be consulted for more specific examples of registering
the kinds of data elements described in international standards and in
information systems. Annex Y, which is
based on ISO/IEC 11179, contains more detailed information and examples to
assist the practitioner who is registering data elements.
A
top-down approach is useful in many circumstances. Although it requires more "up front" effort, top-down
registration has the potential to produce more stable and uniform metadata. An example of a top-down registration, where
registration begins with identification of conceptual domains, is provided in
informative Annex G with an example of registration of data elements about
biological organisms.
5.1 General Procedures
Often
only a limited amount of information is available about a data element that has
been submitted for registration, e.g., the name and definition contained in a
document or provided by the submitting organization and a set of permissible
values, where appropriate. The general procedures that follow are
intended to result in the registration of a complete, well-defined data element
that meets the requirements of a particular registration authority.
It
should be noted that the metadata for some data elements in a registry will
never be complete. This is true of
application data elements that are obtained from computer software, where very
little information is known except the representational attributes (e.g., field
length and datatype). For these data
elements, only the most basic attributes will be entered, and the data
element's registration status will remain incomplete.
5.1.1 Understanding the Data Element
The first step in the
registration procedure is to gain an understanding of the data element. What kind of data will be stored in this
data element? Is there a definition or
description of the data values? Were
permissible values or examples of the data provided? Will the data values be determined by an arithmetic or
statistical procedure? What will the
data values look like; e.g., are they names or descriptions of things, numerals
to be calculated, strings of characters and numbers that are identifiers? Where documentation is inadequate to fully
understand the data element, the practitioner must consult those who represent
the source of the data element to obtain the necessary information.
The
result of this first step is an understanding of the semantic content of the
data element.
5.1.2 Content Research
Prior
to formulating attributes towards registration of a new data element, the
registrar should perform content research to determine whether a data element
is described in an existing International or National standard, or whether a
data element that has the potential for being reused exists in the registry or
a federation of registries. It is
necessary to recognize that the registration practitioner must make value
decisions when recording metadata into the metadata registry. The practitioner will determine if a data
element might be adapted to meet new requirements, or some attributes of an
existing data element (e.g., value domain, data element concept, or conceptual
domain) might be reused with the new data element. Content research should include a search of conceptual domains,
data element concepts, and value domains as well as data elements, to identify
attributes that might be relevant to the data element to be registered. If a standard data element exists that can
be used as a model to meet the particular specifications for a new purpose,
some of its attributes may be reused for registration of the new data element.
The
result of this step is confirmation that a new data element is needed, or a
decision to modify or reuse an existing data element.
5.1.3 Definition and Permissible Values
The
essential semantic content of a data element must be captured in a data element
definition. Part 4 of ISO/IEC 11179
describes rules and guidelines for formulating definitions. Part 3 identifies the attributes for
describing the domain of potentially valid (i.e., permissible) values. The permissible values for a data element
are defined as a value domain. Examples
are provided in Annex Y for formulating definitions, based on the rules and
guidelines set forth in ISO/IEC 11179-4. Annex Y also contains detailed
information about the attributes in value domains and examples of how those
attributes are used for both enumerated (i.e., established through a list) and
non-enumerated domains (i.e., specified through a formula, rule, procedure, or
reference).
Different
attributes are used depending upon whether the potentially valid values are
enumerated or non-enumerated. Each
permissible value is associated with a valid value meaning that provides
meaning to the permissible value, as described in Section 5.1.6. Each permissible value is also entered in
the registry with its begin date (i.e., the date when that permissible value
became valid for that value meaning).
End dates will also be entered, when the permissible value for a value
meaning becomes invalid.
Value
domains for non-enumerated domains must include a definition/description of the
values that are possible valid values for the data element. This report contains specific examples of
registering data elements with enumerated domains (Sections 5.2 and 5.4) and
with non-enumerated domains (Section 5.3).
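The difference between the two kinds of value domain can be sketched in Python as follows. The class names, the status codes "A" and "X", and the age rule are hypothetical and appear only for illustration. The sketch also assumes that the end date is the first day on which a permissible value is no longer valid; a registration authority may define it differently.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Optional


@dataclass
class PermissibleValue:
    value: str
    value_meaning: str
    begin_date: date
    end_date: Optional[date] = None  # assumption: first day the value is invalid

    def valid_on(self, day: date) -> bool:
        return self.begin_date <= day and (self.end_date is None or day < self.end_date)


@dataclass
class EnumeratedDomain:
    """Value domain established through a list of permissible values."""
    permissible_values: list

    def accepts(self, value: str, day: date) -> bool:
        return any(pv.value == value and pv.valid_on(day) for pv in self.permissible_values)


@dataclass
class NonEnumeratedDomain:
    """Value domain specified through a description and a rule."""
    description: str
    rule: Callable[[str], bool]

    def accepts(self, value: str) -> bool:
        return self.rule(value)


status = EnumeratedDomain([
    PermissibleValue("A", "active", date(1990, 1, 1)),
    PermissibleValue("X", "suspended", date(1990, 1, 1), date(1995, 1, 1)),
])
age = NonEnumeratedDomain(
    "A whole number of years from 0 to 150",
    lambda s: s.isdigit() and int(s) <= 150,
)
```

An enumerated domain answers validity questions by lookup against dated permissible values, while a non-enumerated domain relies on its definition/description, here expressed as a rule.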
5.1.4 Name and Identifiers
Part
5 of ISO/IEC 11179 gives principles for naming and identification of data
elements. Each data element registered
within a Registration Authority (RA), i.e.,
an organization authorized to register metadata, is unambiguously identified
with a unique identifier. Although the
standard does not specify the format or content of the data element identifier
(DI), the DI should carry no useful information about the data element, e.g.,
it might be a number assigned sequentially by an automated system. If the attributes of a data element change,
a new version of the data element is created and registered with a version
identifier (VI).
Since
each RA establishes its own identification scheme, the same DI might be used
to identify a different data element in another metadata registry. Therefore, a Registration Authority
Identifier (RAI) must be established for unique identification of a data
element. Data elements registered under
the provisions of ISO/IEC 11179 are assigned an international registration data
identifier (IRDI), which is a composite of the RAI, the DI, and the VI. Part 6 of ISO/IEC 11179 describes the
requirements for an RA and the construction of an RAI. The IRDI is discussed further in Part 6.
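The composite structure of the IRDI can be sketched in Python as shown below. The colon separator, the integer version identifier, and the identifier values are assumptions made for this example; the required formats are those given in Part 6 of the standard and are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRDI:
    rai: str  # Registration Authority Identifier
    di: str   # data element identifier, carrying no meaning of its own
    vi: int   # version identifier (an integer is assumed here)

    def __str__(self) -> str:
        # The ":" separator is an assumption of this sketch.
        return f"{self.rai}:{self.di}:{self.vi}"

    def next_version(self) -> "IRDI":
        """A change to the attributes yields a new version under the same RAI and DI."""
        return IRDI(self.rai, self.di, self.vi + 1)


# Two registration authorities may assign the same DI to different data elements.
a = IRDI("RA-ONE", "1001", 1)
b = IRDI("RA-TWO", "1001", 1)
a2 = a.next_version()
```

Although a and b share a DI, the RAI keeps the two identifiers distinct, and a2 identifies a later version of the same data element as a.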
Most
people prefer to use names when talking about a data element, rather than a
non-intelligible identifier. Therefore,
one or more names can be assigned to a data element, each associated with the
context in which the name is used. A
name can be developed for a scientific discipline, an organization, a
particular computer language, a database management system, or other
purpose. Each name is developed
according to the naming convention for the particular name context. The naming convention can vary from
"whatever you want to call it" to a highly structured name. ISO/IEC 11179, part 5 does not specify a
mandatory naming convention, but does explain how to document one. For this report, the data element names are
based on a naming convention described in Annex Y. Annex Y also expands on Part 5 of the standard by providing
examples of the use of names and name contexts.
5.1.5 Other Metadata Attributes
Other
mandatory and optional data element attributes are described in Part 3 of
ISO/IEC 11179. In addition to the
definitional attributes described in Section 5.1.3 and the identifying
attributes described in Section 5.1.4, there are administrative, relational,
classifying, and other miscellaneous attributes that serve to define and
describe a data element.
In
addition to the mandatory attributes specified by Part 3 of the standard, an RA
might establish a profile for a particular metadata registry, where some of the
attributes described as optional in the standard are mandatory for that
registry, some optional attributes are not included, and additional attributes
might be identified to extend the registry.
The
attributes that relate data elements through data element concepts (Section
5.1.6), and those that classify data elements (Section 5.1.7) are described in
subsequent sections of this report.
Many information sources do not provide information about the data
element for these categories. Some administrative information is related to
quality control, and is described in Section 5.1.8. Annex Y includes detailed information about these metadata
attributes.
For
the registration procedure described in this report, some administrative and
miscellaneous attributes are recorded at this time, including:
· Submitting organization: The submitting organization is the office or organization that has submitted the data element for registration.
· Data Steward: The data steward is the individual who has been assigned by a submitting organization to be responsible for authorizing and maintaining one or more data elements.
· Note: A data element may have a "Note" or "Comment" that can be used to capture additional descriptive information about a data element, including usage, procedure, and other explanatory information that is not appropriate to include in the data element definition attribute.
· Example: A data element shall be registered with an example, which must be one of the permissible values for enumerated value domains or must conform to the value domain description/definition and other value attributes for non-enumerated domains.
· Origin: A data element can be associated with any kind of source, including a document, standard, system, group, partner, or message set. One source, as a minimum, must be associated with a data element to indicate the origin of information about the data element.
5.1.6 Data Element Concept
At
this stage in registering a data element, it is possible to specify conceptual
information about the data element through the data element concept. The data element concept can be thought of
as an idea or perception about something, identified and described
independently of any representation.
The data element concept may relate several data elements that record
data about that concept with different representations, e.g., names and codes
that represent provinces of Canada and share the same concept, which is
"Canadian Province Identifier."
The
data element concept is singular (only one concept is represented). It can be associated with many data
elements, including other names and codes, and it does not include a
representation class term in its name or definition. The data element concept is associated with only one Conceptual
Domain, as described in the following paragraph.
Data
element concepts are specified through a definition, an identifier, a name, and
a conceptual domain, i.e., the meanings of the possible set of valid values for
a data element, expressed without representation. The conceptual domain, "Canadian Provinces", would
include valid value meanings such as "The Canadian province of
(Alberta,......., Yukon Territory)," where each value meaning would
identify one Canadian province. Each
value meaning is entered in the registry, associated with its conceptual
domain, with its begin date (i.e., the date when that value meaning became
valid) and end date (i.e., the date when the value meaning became invalid). Permissible values are associated with value
meanings, according to the representation defined by the value domain.
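The relationship between value meanings and permissible values can be sketched in code. The following is a minimal illustration only, not a registry implementation: the class names, the begin date, and the code "AB" as a second representation of Alberta are assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ValueMeaning:
    """One meaning in a conceptual domain, stated without representation."""
    text: str
    begin_date: date                  # date the meaning became valid
    end_date: Optional[date] = None   # date it became invalid; None while valid

    def is_valid_on(self, day: date) -> bool:
        if day < self.begin_date:
            return False
        return self.end_date is None or day < self.end_date


@dataclass(frozen=True)
class PermissibleValue:
    """A representation of one value meaning within a single value domain."""
    value: str
    meaning: ValueMeaning


# Hypothetical begin date, chosen only for illustration.
alberta = ValueMeaning("The Canadian province of Alberta", date(1999, 9, 1))

# Two representations (a name and a code) that share the same value meaning.
by_name = PermissibleValue("Alberta", alberta)
by_code = PermissibleValue("AB", alberta)
```

Both permissible values point at the same value meaning, which is what allows a name and a code to be recognized as representations of one concept.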
Derivation
of data element concepts and conceptual domains, including value meanings, is
described in detail in Annex Y.6.
5.1.7 Classification Attributes
The classification
attributes are recorded, where appropriate, at this time. Classification helps to add information not
easily included in definitions, helps to organize the contents of a metadata
registry, and helps to provide access by supporting more meaningful
queries. Part 2 of ISO/IEC 11179
describes general categories of classification; Part 5 describes three
classified components: object class, property, and representation class.
A
metadata registry might choose to classify data elements as groups, e.g., the
group of data elements used in a mailing address, the group of data elements
used to identify chemical substances, or the group of data elements that locate
a point on the surface of the earth.
Keywords
might also be used to classify data elements, e.g., altitude, date, facility,
industrial, and organization.
5.1.8 Quality Control
Initially,
only some of the attributes will be recorded for a newly registered data
element. Such a data element will be
assigned the registration status of "incomplete." When all of the mandatory attributes have
been completed, but the quality of the metadata has not been verified, the
registration status will be "recorded." Through the quality review process, some data elements will be
determined to be "certified," and some might become
"standard." The
"standard" data element is the preferred data element to be used for
data sharing, to ensure consistent representation and understanding of the data
being communicated.
Part
6 of ISO/IEC 11179 describes the registration process and the registration
status assigned to a data element as the metadata are reviewed and quality is
improved. Many data elements might be
entered into a data registry, but only a relatively small number of them might
be assigned a "standard" registration status. Annex Y describes the assignment of
Registration and Administrative Status throughout the life cycle of a
registered data element. ISO/IEC 11179
Part 6 specifies the levels of registration status; the administrative
statuses, however, are established for each registry by the RA.
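The first step of this progression, from "incomplete" to "recorded," can be sketched as a simple check. This is an illustration under stated assumptions: the function name and the set of mandatory attribute names are hypothetical, and treating the four statuses as an ordered scale is a reading of the text above rather than a requirement quoted from Part 6.

```python
from enum import IntEnum


class RegistrationStatus(IntEnum):
    """The registration statuses named in this section, in increasing order."""
    INCOMPLETE = 1
    RECORDED = 2
    CERTIFIED = 3
    STANDARD = 4


def initial_status(attributes: dict, mandatory: set) -> RegistrationStatus:
    """Return RECORDED once every mandatory attribute has a value.

    CERTIFIED and STANDARD are not returned here because they depend on
    the quality review, which is a human process.
    """
    filled = {name for name, value in attributes.items() if value not in (None, "")}
    if mandatory <= filled:
        return RegistrationStatus.RECORDED
    return RegistrationStatus.INCOMPLETE


# Hypothetical mandatory set, for illustration only.
MANDATORY = {"definition", "name", "identifier"}
```

For example, a data element with only a name and a definition would be reported as incomplete until its identifier is assigned.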
5.2 International Standard with Enumerated
Domain
This
section provides a specific example of the registration of a data element from
an international standard, where the possible valid values are itemized. The International Organization for
Standardization (ISO) 3166-1:1997(E/F), Codes for the representation of
names of countries and their subdivisions - Part 1: Country codes, is used
as the source for this example. ISO
3166:1997 is a complete revision of ISO 3166, which was first published in
1974. The names of countries in the
standard correspond to those given, in English and French, in the current
"Terminology Bulletin - Country Names," issued by the United Nations
Department of Conference Services, entitled "States Members of the United
Nations, Members of the Specialized Agencies or Parties to the Statute of the
International Court of Justice" and to those published in the
"Standard Country or Area Codes for Statistical Use" established by
the United Nations Statistics Division.
The full name is the formal title as notified by the country concerned
to the UN Secretary General.
ISO
3166-1:1997(E/F) cancels and replaces the fourth edition (ISO 3166:1993) and
comprises a consolidation of all changes to the lists of the fourth edition
agreed to by the ISO 3166 Maintenance Agency: ISO 3166 Maintenance Agency
Secretariat, c/o DIN Deutsches Institut für Normung e.V., Burggrafenstrasse 6,
D-10787 Berlin, Germany.
ISO
3166 includes the following domains: short country name in English, full
(official) country name in English (not provided for all countries),
2-character alphabetic code, 3-character alphabetic code, 3-character numeric
code, short country name in French, and full country name in French.
The
following paragraphs are presented in the logical order for formulating
attributes for a standard, enumerated data element, using the short
English-language country name as the example.
The table in Section 5.2.10 contains all of the metadata attributes
recorded for the enumerated data element from an international standard.
5.2.1 Understanding the Data Element
The
data element to be registered is taken from an international standard, and it
includes an authoritative conceptual domain of country identifiers for all of
the countries of the world. The short
English-language name was selected for standardization because it has the most
utility for information systems used by United States (U.S.) federal agencies
as well as the private sector. The
short form of the English-language name is used by the U.S. Postal Service
(USPS) for all outgoing international mail, in preference to any of the codes or
full names that are included in the standard.
The name is also preferred by the USPS to any names that are used
locally by a country to identify itself, e.g., Japan is recognized by the USPS
in preference to Nihon, which is the country name commonly used by that country
itself. The short form of the name in
English has been used in the development of ISO 3166 as the basis for assigning
codes to avoid, wherever possible, any reflection of a country's political
status.
The
English-language short name in the standard varies in length from four
alphabetic characters (e.g., Peru) to 44 alphabetic characters (i.e., South
Georgia and the South Sandwich Islands).
The names use the English language alphabet for their character set.
5.2.2 Content Research
Other standards that contain
conceptual domains for country identification include U.S. Federal Information
Processing Standards (FIPS), published by the U.S. Department of Commerce,
Technology Administration, National Institute of Standards and Technology
(NIST). FIPS 10-4 is maintained by the
Office of the Geographer and Global Issues, U.S. Department of State. It is intended for use in activities by the
Department of State and national defense programs, and can also be used for
Federal interchanges of information with the non-Federal sector of the
U.S. FIPS 10-4, published in April
1995, reflects changes through May 6, 1993.
FIPS 104-1 implements an American National Standards Institute standard
ANSI Z39.27-1984, and adopts, with qualifications, entities, names, and codes
prescribed by ISO 3166. FIPS 104-1 was
last updated on May 12, 1986. The
maintenance organization is the National Bureau of Standards (now NIST) in
coordination with the U.S. Department of State, the U.S. Board of Geographic
Names, and the maintenance organization for ISO 3166. There are no known plans to update either of the FIPS standards,
and neither of these standards is recognized internationally.
An
authoritative international source of value domains which has ongoing
maintenance is a necessity for maintaining data values for the data elements
identifying countries of the world.
Therefore, ISO 3166:1997 is used as the origin of the data element
for country name.
5.2.3 Definition and Permissible Values
The
definition and permissible values are the most important metadata attributes in
uniquely describing a data element.
5.2.3.1 Definition
Understanding that the
essential meaning of this data element is to identify countries using a short
name in the English-language, the data element definition can be formulated as
"The short name of a country, represented in the English language." This definition is formulated using the
mandatory rules and guidelines established in ISO/IEC 11179-4. The rules and guidelines from Part 4 are
described with examples in Annex Y.2. The definition is singular, since any
instance of the data element contains only one value.
5.2.3.2 Permissible Values
The permissible values for
the data element are the short names in English, listed in ISO 3166 (e.g.,
Afghanistan, Albania, ......., Zimbabwe). Each permissible value is entered
into the registry with the date when that permissible value was valid for that
value domain (in this case the date is January 10, 1997, the same as the begin
date for the value meaning). There is
no end date to enter at this time.
The scope of the permissible
values for this data element includes the short English-language name for all
countries. A value domain is defined as
the permissible values for a data element.
For this example, the value domain is described as "All short,
English-language names of all countries."
Note that Part 3 of ISO/IEC 11179 does not require a description or
definition for enumerated domains. Some
RAs, however, prefer that all value domains be registered with a
description/definition. Record the
other value domain attributes for this example at this time, including:
· Character Set: The character set for Short English-Language Country Name is "English language."
· Domain Type: Country names are a fixed list of countries, maintained by international standards; therefore, the domain type is "enumerated."
· Datatype: The datatype for country name is "alphanumeric."
· Maximum and minimum field lengths: Based on prior research (Section 5.2.1), the minimum length for values for the data element is known to be four. The known maximum length for names in the current standard is 44. The maximum field length, however, is set to 60, to accommodate any changes or additions to the domain of values.
· Format: The format selected by the registration authority for this example is A(60) to accommodate the longest of the English-language short names.
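The value domain attributes above can be checked mechanically against the list of permissible values. The following is a minimal sketch, not part of the standard: the class and attribute names are assumptions, and only a handful of the ISO 3166 short names are included as sample data.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class EnumeratedValueDomain:
    permissible_values: FrozenSet[str]
    min_length: int     # shortest known value
    max_length: int     # longest known value
    field_length: int   # taken from the format, e.g. A(60)

    def accepts(self, value: str) -> bool:
        """An enumerated domain accepts only the values it lists."""
        return value in self.permissible_values

    def is_consistent(self) -> bool:
        """Check the recorded lengths against the actual values."""
        lengths = [len(v) for v in self.permissible_values]
        return (min(lengths) >= self.min_length
                and max(lengths) <= self.max_length
                and self.max_length <= self.field_length)


# Sample subset only; a registry would hold the full list from ISO 3166.
short_names = EnumeratedValueDomain(
    permissible_values=frozenset({
        "Afghanistan", "Albania", "China", "Peru", "Zimbabwe",
        "South Georgia and the South Sandwich Islands",
    }),
    min_length=4,
    max_length=44,
    field_length=60,
)
```

The same check also confirms that an example value such as "China" is a member of the domain, as required for the Example attribute of an enumerated data element.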
5.2.4 Identify and Name the Data Element
Names do not identify a data
element. Identification requires a
unique identifier, preferably one that does not contain information about the
data element. The name provides a
designator so that users of the registry have terms by which they refer to the
data element.
5.2.4.1 Identification
Assign a unique identifier
to the data element for short English-language country name, as described in
Annex Y for the identification of data elements. In the metadata registry for
this example, a unique DI and VI (20903:1) are assigned by the computer at the
time of registration.
5.2.4.2 Name Context and
Naming Convention
ISO/IEC 11179 Part 5
describes the naming of data elements.
Annex Y gives examples of name contexts and naming conventions. For this international standard data element,
the name is assigned the context of "Registry," and it is derived
based on the example naming convention provided in Annex Y and summarized as
follows:
· Scope: The scope of this example naming convention is Registry Name.
· Authority: The authority for this example is the U.S. Environmental Protection Agency for its Environmental Data Registry.
· Semantic Rules: Names shall include an object and a property, where appropriate. Qualifiers shall be used to differentiate between names that would otherwise be the same. The representation class term shall always be included as the last term in the name.
· Lexical Rules: A data element name shall have a maximum of 100 alphanumeric characters. The language of the registry shall be English, and the character set ASCII. There are no controlled word lists.
· Name Uniqueness: Names shall be unique within a registration authority.
5.2.4.3 Name the Data
Element
Using the above naming
convention, the name is entered with the context of "Registry." The convention specifies that the name
should include the object "Country", to indicate the data values to
be stored in the data element. The name
should also include the representation for the concept, in this example
"Name." For this particular
example, it is necessary to qualify the name, since there are four value
domains of country names in the ISO 3166 standard. The qualifiers: "short" and
"English-language" are appropriate to this example. The name that has been formulated for this
data element, therefore, is "Short English-Language Country
Name."
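The assembly of this name can be sketched as a small function. This is an illustration of the example naming convention only: the function name is hypothetical, and placing the qualifiers before the object term is an assumption drawn from this one example, since the convention itself fixes only the position of the representation class term.

```python
def build_name(object_term, representation_term, qualifiers=()):
    """Compose a registry name: qualifiers, object, representation class term.

    The representation class term is always last, as the semantic rules
    require. The lexical rules (maximum of 100 characters, ASCII) are
    checked before the name is returned.
    """
    name = " ".join([*qualifiers, object_term, representation_term])
    if len(name) > 100:
        raise ValueError("name exceeds 100 characters")
    if not name.isascii():
        raise ValueError("name must use the ASCII character set")
    return name


country_name = build_name("Country", "Name", ("Short", "English-Language"))
```

Name uniqueness within the registration authority is not checked here; it would require a lookup against the names already registered.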
5.2.5 Other Metadata Attributes
Other metadata attributes
that can be recorded at this time are:
· Select the example for this data element; it must be one of the permissible values in the value domain.
Example: China
· Identify the origin for this data element as the standard from which the permissible values are obtained.
Origin: ISO 3166-1:1997, Codes for the
representation of names of countries and their subdivisions - Part 1: Country
codes (Document)
· Record any notes or comments that might provide additional information about the data element that is not included in the definition.
Note: This data element is included in the EPA
revised interim Facility Identification Standard.
· Enter the name of the submitting organization, which is the office that submitted the data element for registration.
Submitting Organization:
Office of Information Resources Management
· Record the name of the individual or organization assigned the responsibility for monitoring and maintaining the data element as the data steward.
Data Steward: Marian Cody
· Administrative metadata, such as Create Date and User Name, are recorded or captured automatically by the system where applicable.
5.2.6 Data Element Concept
Identification of the data
element concept, as described in Section 5.1.6 is based on the data element
name and definition, without the representation. The concept represented by the data element "Short
English-Language Country Name" is "Country Identifier," defined
as "An identifier for a primary geopolitical entity of the world." This concept can be represented by all seven
of the names and codes included in ISO 3166.
The conceptual domain is a
collection of value meanings that provide meaning to the permissible values for
a data element. The conceptual domain
that contains value meanings related to the identity of countries of the world
is named "Countries of the World."
It is defined as "The primary geopolitical entities of the
world." The value meanings
associated with this conceptual domain are defined as "The primary
geopolitical entity of the world known as <country name>,"
where country name is one of the country names listed in ISO 3166. Each value meaning is identified by its own
value meaning identifier (VMID) and each is entered into the registry with the
date when that value meaning was entered into the conceptual domain (in this
case the date is January 10, 1997). End
dates will also be entered, when the value meaning becomes invalid (e.g., when
a country name changes or the territory of a country changes to be combined
with another country or to be subdivided into two or more other
countries).
5.2.7 Classification
This data element might be
classified according to the following classification schemes:
· Identify one or more
keywords, where the keyword is a name or subject matter descriptor that will
facilitate grouping like data elements for retrieval.
Keyword: Country.
· Group Short,
English-Language Country Name with similar data elements according to concept
for translation or by general subject matter.
Conceptual group: Country
Identifiers
Subject group:
Geopolitical Entities.
· Identify the class by which
this data element is represented.
Representation Class: Name
· One or more real world objects that identify this data element can be identified at this time.
Object: Country
5.2.8 Quality Control
When all of the mandatory
metadata attributes have been entered for this data element, it is assigned the
Registration Status of "Recorded" and the administrative status of
"In Quality Review." Because
the data element was identified by an international standard, and it is
expected to be the preferred data element for representing country name within
the example metadata registry, the registration status will be updated to
"Standard" with administrative status "Final" after the necessary quality
review has been completed.
5.2.9 Other Codes and Names from ISO 3166
Other codes, official
English names, and French names (both official and short) from ISO 3166 are
registered with their individual value domains, representation, data element
definitions, and data element names.
All of the data elements associated with ISO 3166 will share the same
data element concept (i.e., Country Identifier, defined as "An identifier
for a primary geopolitical entity of the world.") and the same conceptual
domain (i.e., Countries of the World, defined as "The primary geopolitical
entities of the world."). All of
the ISO 3166 data elements will share the same value meanings. They will, however, have different sets of
permissible values associated with the value meanings, depending upon the data
element, its representation, and its value domain.
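Because the ISO 3166 data elements share value meanings, a value in one value domain can be translated to another by way of the shared meaning. The sketch below illustrates the idea with two countries; the value meaning identifiers (VM1, VM2), the domain labels, and the function name are hypothetical and are not taken from the registry.

```python
# Value meanings, keyed by a hypothetical value meaning identifier.
VALUE_MEANINGS = {
    "VM1": "The primary geopolitical entity of the world known as China",
    "VM2": "The primary geopolitical entity of the world known as Peru",
}

# Each value domain maps its own permissible values to the shared meanings.
VALUE_DOMAINS = {
    "short English name": {"China": "VM1", "Peru": "VM2"},
    "2-character alphabetic code": {"CN": "VM1", "PE": "VM2"},
    "3-character numeric code": {"156": "VM1", "604": "VM2"},
}


def translate(value, source, target):
    """Map a permissible value from one value domain to another."""
    vmid = VALUE_DOMAINS[source][value]
    for candidate, candidate_vmid in VALUE_DOMAINS[target].items():
        if candidate_vmid == vmid:
            return candidate
    raise KeyError(f"no value for {vmid} in domain {target!r}")
```

No direct name-to-code table is needed: each value domain is maintained on its own, and the value meaning serves as the pivot between them.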
5.2.10 Summary of Attributes
The metadata attributes that
have been assigned to this data element, the short, English-language country
name identified by the ISO 3166:1997 standard, are summarized in the following
table, and in the first column of the table in Annex F.
Data Element Meta-- Example model Attribute Name | ISO 3166 Enumerated, Name

1. Data Element Definition and Permissible Values
Data Element Definition Context | Registry
Data Element Definition | The English-language short name of a country.
Permissible Values | All English-Language Short Country Names from ISO 3166, matched with value meanings. (Afghanistan, Albania,......, Zimbabwe)
PV Begin Date | 19971001
PV End Date | (Not Applicable)
Value Domain Definition | All English-language short names of all countries.
Character Set | English language
Domain type | Enumerated
Determinant Type | (Not Applicable)
Range Limits | (Not Applicable)
Datatype | Alphanumeric
Minimum | 4
Maximum | 44
Format | A(60)
Unit of Measure | (Not Applicable)
Precision | (Not Applicable)

2. Data Element Name and Identifier
Data Element Name Context | Registry
Data Element Name | Short English-Language Country Name
DE Identifier/ Version Number (DI:VI) | 20903:1

3. Other Metadata Attributes
Example | China
Origin | ISO 3166-1:1997, Codes for the representation of names of countries and their subdivisions - Part 1: Country codes (Document)
Note/Description | This data element is included in the EPA revised interim Facility Identification Standard.
Submitting organization | Office of Information Resources Management
Data Steward | Marion Cody

4. Data Element Concept (DEC)
Data Element Concept Name | Country Identifier
Data Element Concept Definition | An identifier for a primary geopolitical entity of the world.
Conceptual Domain Name | Countries of the World
Conceptual Domain Definition | The primary geopolitical entities of the world.
Enumerated Value Meaning Text | The primary geopolitical entity known as <China>.
VM Begin Date | 19971001
VM End Date | (Not Applicable)

Classification
Keyword | Country
Group | Country Identifiers, Geopolitical Entities
Representation Class | Name
Object | Country

Quality Control
Registration Status | Standard
Administrative Status | Final
5.3 International Standard with Non-Enumerated Domain
This section provides a
specific example of the registration of a data element from an international
standard, where the possible valid values are not enumerated, but must be
determined by a procedure. The International
Organization for Standardization (ISO) 6709-1983 (E), Standard
representation of latitude, longitude and altitude for geographic point
locations, is used as the source for this example. ISO 6709 was developed by ISO Technical
Committee ISO/TC 97, Information processing systems, and was circulated to
member bodies in November 1981.
Eighteen countries approved the standard; no member body expressed
disapproval. There is no known schedule
for review and update of the standard.
ISO/TC 32 has been assigned as the maintenance authority for the
standard; ISO/TC 211 has expressed an interest in assuming responsibility for
its maintenance.
The table in Section 5.3.10
contains all of the metadata attributes recorded for the non-enumerated data
element from an international standard.
5.3.1 Understanding the Data Element
Latitude is a measure of the
angular distance on a meridian north or south of the equator. The standard provides for a variable format
and more than one representation for recording the latitude measure (i.e.,
degrees and decimal degrees, and sexagesimal [degrees, minutes, and
seconds]). The standard also includes
more than one representation and format for longitude, and a flexible format
for altitude. In addition, a standard
format for data transfer is included in the standard.
Although new technology and
new tools (e.g. Global Positioning System [GPS]) and analytical and mapping
software have caused some geographic information specialists to prefer the
measurement of locational coordinates in degrees and decimal degrees, many
organizations continue to measure latitude and longitude in degrees, minutes,
and seconds. Therefore, the RA of the
metadata registry in this example has determined a need to register a data
element for latitude measured in degrees, minutes, and seconds. According to the standard, the placement of
the decimal point indicates the transition from degrees to sexagesimal
measures. Examples of data in the standard
include sexagesimal latitudes that are measured to a range of one or two
decimal places for seconds. The
standard, however, does not limit the precision, but requires only that the
number of decimal places indicate the precision of the measurement. The RA for this example requires that
latitude be recorded up to 5 decimal positions, where it can be measured to
that level of precision.
Latitude values are measured
in a range of 0 (on the equator) to 90 degrees. Minutes and seconds each are measured in a range of 0 to 60. Latitude values on or north of the equator
are recorded as positive numbers; those south of the equator are negative. Where latitude degrees are measured as a
single digit, they must be recorded with a preceding zero. For data transfer, latitude measures must be
preceded by the directional symbol (+ or -), and they must include a decimal
point when the measurement includes decimal seconds. Latitude always precedes longitude, which precedes altitude. The latitude and longitude must be expressed
in the same format style and to the same precision (indicated by the number of
decimal positions). There are no
separators between the latitude, longitude, and altitude; the directional
symbol serves as a separator for the data element values.
5.3.2 Content Research
Part 11 of ISO 15046,
Spatial referencing by coordinates, describes the minimum data required to
define 1-, 2-, and 3-dimensional coordinate reference systems. The coordinate reference system must be
fully defined for a position to be unambiguous. Knowledge of the reference system is necessary to determine if
coordinate points are comparable. The
standard does not, however, provide information about representation of the
coordinates. ISO/TC 211/ WG 3, the
workgroup that is currently revising ISO 15046, has expressed an interest in
revising ISO 6709-1983 (E), Standard representation of latitude, longitude
and altitude for geographic point locations. Because of TC 211's interest in ISO 6709, and their current work
on the closely related standard, ISO 15046, it seems likely that ISO 6709 will
soon be reviewed and updated if needed.
Therefore, ISO 6709 seems appropriate to be identified as a standard data
element for latitude measure where latitude is measured as sexagesimal (i.e., in degrees, minutes, and
seconds).
A search of the metadata
registry in our example reveals about 40 data elements related to latitude
measure. One, an EPA interim standard
for latitude, measured in degrees and decimal degrees, is compliant with the
ISO 6709 data element for degrees. None
of the other data elements has the potential for compliance with ISO 6709 for
sexagesimal measure of latitude. The
other latitude data elements in the registry have been assigned the
registration status of incomplete, and many data elements are qualified (e.g.,
latitude where a facility is located, latitude of a smoke stack). For the purpose of this example, none have
the potential for being modified to meet the requirements of the ISO 6709
standard for latitude, measured in degrees, minutes, and seconds.
Therefore, in this example,
the ISO 6709 latitude, sexagesimal
measure, is selected for registration as a new data element.
5.3.3 Definition and Permissible Values
5.3.3.1 Definition
The data element definition
is formulated according to the rules and guidelines described in Annex Y, based
on ISO/IEC 11179-4. The rules require
that a data element definition be unique within the registry, so the unit of
measure has been included in the definition as "The sexagesimal measure of
the angular distance on a meridian north or south of the equator." Including the unit of measure in the
definition distinguishes the data element from the EPA interim standard,
defined simply as "The measure of the angular distance on a meridian north
or south of the equator." The
definition is singular, because it refers to only one instance of the data
value. Note that ISO 6709 does not
include a definition for latitude.
5.3.3.2 Permissible Values
ISO 6709 is an international
standard that does not list specific values that are valid for the data
element; the measure of latitude is a non-enumerated domain. There are no stored permissible values in a
registry for non-enumerated domains.
The values that are permissible for the ISO 6709 sexagesimal latitude
data element are those values that conform to the definition of the value domain
and the attributes for datatype, format, unit of measure, and precision. The value domain for sexagesimal latitude
can be described as "All sexagesimal measures of the distance of an angle
north or south of the equator." By
including the unit of measure in the definition, the value domain is
distinguished from the value domain description for latitude measured in
degrees. The definition is plural,
because it includes all possible measurements of latitude determined by this
type of measurement.
Latitude values that are
measured as degrees, minutes, and seconds must conform to the format +/-DDMMSS
to +/-DDMMSS.SSSSS. The precision of
the value is indicated by the number of decimal places recorded.
Other value domain
attributes for this example include:
· Character Set: The character set for latitude measure is "English language."
· Domain Type: Non-enumerated.
· Description/definition: All sexagesimal measures of the distance of an angle north or south of the equator.
· Datatype: The datatype for latitude measure is "alphanumeric" to explicitly include the directional symbol and decimal point, where appropriate.
· Maximum and minimum field lengths: The known minimum field length at this time is seven (+/- DDMMSS) where no decimal seconds are recorded. The maximum field length is 13 (+/- DDMMSS.sssss), to accommodate up to five decimal places for seconds.
· Format: The format selected by the registration authority for this example is A(13) to accommodate the maximum number of decimal positions.
· Range: The range for degrees is 0-90; for minutes is 0-60; for seconds is 0-60.
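Because the domain is non-enumerated, conformance is decided by rule rather than by lookup. The following is a minimal sketch of such a rule, assuming the format, field lengths, and ranges listed above; the pattern and function names are hypothetical, and the range limits are applied exactly as stated in this section, with no further checks.

```python
import re

# Sign, two digits each for degrees, minutes, and seconds, then an
# optional decimal point followed by one to five decimal places.
LATITUDE_PATTERN = re.compile(r"([+-])(\d{2})(\d{2})(\d{2})(?:\.(\d{1,5}))?")


def is_valid_latitude(text: str) -> bool:
    """Check a value against the format and ranges given for this domain."""
    match = LATITUDE_PATTERN.fullmatch(text)
    if match is None:
        return False
    degrees = int(match.group(2))
    minutes = int(match.group(3))
    seconds = int(match.group(4))
    # Ranges as stated in this section: 0-90, 0-60, 0-60.
    return degrees <= 90 and minutes <= 60 and seconds <= 60
```

The pattern itself enforces the minimum length of seven and the maximum of 13 characters, so no separate length test is required.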
5.3.4 Identifying and Naming the Data Element
5.3.4.1 Identifiers
A unique identifier is
required for the latitude data element.
For the RA in this example, the DI and VI (312345:1) are assigned
automatically by the metadata registry software.
5.3.4.2 Name Context and
Naming Convention
For this ISO standard data
element, the name is assigned with the context of Registry, using the naming
convention described in the example in Annex Y, summarized as follows:
· Scope: The scope of this example naming convention is Registry Name.
· Authority: The authority for this example is the U.S. Environmental Protection Agency for its Environmental Data Registry.
· Semantic Rules: Names shall include an object and a property, where appropriate. Qualifiers shall be used to differentiate between names that would otherwise be the same. The representation class term shall always be included as the last term in the name.
· Lexical Rules: A data element name shall have a maximum of 100 alphanumeric characters. The language of the registry shall be English, and the character set ASCII. There are no controlled word lists.
· Name Uniqueness: Names shall be unique within a registration authority.
5.3.4.3 Name the Data
Element
Using the above naming
convention, the name is entered with the context of "Registry." The convention specifies that the name
should include the object "Latitude", to indicate the data values to
be stored in the data element. Include the representation for the concept in
the name; in this example "Measure."
There is no requirement in ISO/IEC 11179 Part 5 that data element names
be unique in a registry. However, the
naming convention used in this example specifies that names must be unique
within a registry. It is advisable to
use a qualifier in the data element name to differentiate between data elements
that might otherwise have the same name. The name includes the object
(latitude) and the representation (measure).
For this example, the name of the latitude data element will carry the
qualifier "sexagesimal" as a discriminator. The name that has been derived for the latitude data element is
"Latitude Sexagesimal Measure."
5.3.5 Other Metadata Attributes
Other metadata attributes
that can be recorded at this time are:
$ Provide an example of the data value that conforms to the description in the value domain, and to the datatype, format, and other value domain attributes for this data element.
Example: +674532 and
+674531.85435
$ Record the origin of this data element as the standard where the data element was identified.
Origin: ISO 6709-1983 (E), Standard
representation of latitude, longitude and altitude for geographic point
locations.
$ Record notes and comments that contain additional information about the data element that is not appropriate for the definition.
Note: Latitude sexagesimal
converts to latitude degrees by the following formula: seconds / 60 = decimal
minutes; total minutes / 60 = decimal fraction of a degree, which is added to
the whole degrees.
$ List the Office that submitted the data element for registration as the submitting organization.
Submitting Organization:
Office of Information Resources Management
$ The organization or individual that has responsibility for maintaining and updating the data element is recorded as the data steward for that data element.
Data Steward: Larry
Fitzwater
$ Administrative metadata, such as Create Date and User Name are recorded or captured automatically by the system where applicable.
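The conversion from sexagesimal latitude to decimal degrees mentioned in the note can be sketched as follows. This is a minimal illustration in Python, assuming the input already conforms to the +/-DDMMSS.sssss format; the function name is hypothetical. The result is degrees + minutes/60 + seconds/3600, with the sign applied to the whole value.

```python
def sexagesimal_to_decimal_degrees(text: str) -> float:
    """Convert a +/-DDMMSS[.sssss] latitude string to signed decimal degrees."""
    sign = -1.0 if text[0] == "-" else 1.0
    degrees = int(text[1:3])
    minutes = int(text[3:5])
    seconds = float(text[5:])  # includes any decimal places of seconds
    # Seconds become a fraction of a minute; total minutes become a fraction of a degree.
    total_minutes = minutes + seconds / 60
    return sign * (degrees + total_minutes / 60)
```

For the example value +674532 this gives approximately 67.7589 degrees.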
5.3.6 Data Element Concept
The methodology to be used
for deriving a data element concept is described in Section 5.1.6 and Annex Y
of this document. A data element
concept is the data element without representation. We have indicated previously
that latitude is a distance measure, where measure is its representation. The data element concept for latitude
measure is "Latitude Distance" with the definition "A measure of
the angular distance of a point on the surface of the earth north or south of
the equator." Note that this
concept definition incorporates the
term "measure," which is a representation term. The concept of latitude, however, is the
measure of a distance. Therefore, it is
appropriate in this instance to use the term measure when defining the
concept.
A conceptual domain is a
collection of value meanings. The
collection must be identified with a name and a definition. The latitude is one of the horizontal
coordinates that fix a position on the surface of the earth either north or
south of the equator. For this example,
the name of the conceptual domain for latitude measure is "Latitude
Coordinates" with the definition "The coordinates that indicate the
distance north or south of the equator for locations."
For non-enumerated domains,
such as latitude measure, the value meanings are not explicitly
identified. The conceptual domain for
the Latitude Distance data element concept is the perceived repository of all
latitudes that mark positions on the earth with relation to the equator. The value meanings could be defined as
"The distance measure of a point north or south of the equator that is
<value>." No value meanings
are stored in the registry.
5.3.7 Classification
This data element might be
classified according to the following classification schemes:
$ Identify one or more keywords, where the keyword is a name or subject matter descriptor that will facilitate grouping like data elements for retrieval.
Keyword: Latitude,
Horizontal Coordinate, Spatial
$ Group Latitude Sexagesimal Measure with similar
data elements according to concept for translation or by general subject
matter.
Subject group: Geographic
Point Location.
$ Identify the class by which this data element is represented.
Representation Class:
Measure
$ One or more real world objects that identify this data element can be identified at this time.
Object: Latitude
5.3.8 Quality Control
When all of the mandatory
metadata attributes have been entered for this data element, it is assigned the
registration status of "Recorded" and the administrative status of
"In Quality Review." This
data element was identified in an international standard, and so its
registration status would soon be raised. The data element, however, would not be expected to be assigned
the status of "Standard." The data
element is not expected to become the preferred representation for latitude
measure, since geographic information specialists prefer that latitude and
longitude be recorded in degrees and decimal degrees. Therefore, after quality review has been completed, the data
element will be assigned the registration status of "Certified" with an
administrative status of "No further action."
5.3.9 Other Data Elements in ISO 6709
ISO 6709 identifies five
data elements: sexagesimal latitude, degrees latitude, sexagesimal longitude,
degrees longitude, and altitude. The
different formats represented by the units of measure for latitude (i.e.,
degrees and sexagesimal) express representation (i.e., unit of measure). The two latitude data elements from ISO
6709 are translatable at the concept level, based on their unit of measure
representations. They share the same
conceptual domain, because their implied value meanings are the same. Likewise, the longitude data elements share
a data element concept and a conceptual domain, and longitude data can be
translated based on unit of measure conversions.
Whereas the multiple data elements
identified in ISO 3166 share the same data element concept and the same
conceptual domain, the data elements identified in ISO 6709 do not all share data
element concepts and conceptual domains. All three concepts: latitude,
longitude, and altitude, are distance measures. Latitude, however, is a north/south measure with respect to the
equator; longitude is an east/west measure with respect to the prime meridian;
and altitude is a vertical measure with respect to a point of reference such as
sea level. Each has its own data
element concept and its own conceptual domain.
These data elements do share
classification. All can be classified
as the group "Geographic Point Location" and as the representation
class "Measure."
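The relationships described in this section can be sketched as a small data model. The following Python illustration is an assumption about how a registry might hold these links, not a structure defined by ISO/IEC 11179; the class names, the helper function, and the longitude concept and domain names are hypothetical. Two data elements are treated as translatable when they share a data element concept, which in turn carries the shared conceptual domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualDomain:
    name: str


@dataclass(frozen=True)
class DataElementConcept:
    name: str
    conceptual_domain: ConceptualDomain


@dataclass(frozen=True)
class DataElement:
    name: str
    unit_of_measure: str
    concept: DataElementConcept


def translatable_at_concept_level(a: DataElement, b: DataElement) -> bool:
    """True when both data elements represent the same data element concept."""
    return a.concept == b.concept


latitude_concept = DataElementConcept(
    "Latitude Distance", ConceptualDomain("Latitude Coordinates")
)
# Hypothetical names: the longitude concept and domain are not named in this document.
longitude_concept = DataElementConcept(
    "Longitude Distance", ConceptualDomain("Longitude Coordinates")
)

latitude_sexagesimal = DataElement(
    "Latitude Sexagesimal Measure", "Sexagesimal", latitude_concept
)
latitude_degrees = DataElement("Latitude Degrees Measure", "Degrees", latitude_concept)
longitude_sexagesimal = DataElement(
    "Longitude Sexagesimal Measure", "Sexagesimal", longitude_concept
)
```

In this sketch the two latitude data elements are translatable at the concept level, while a latitude and a longitude data element are not, even though both carry the representation class "Measure."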
5.3.10 Summary of Metadata Attributes
The following table
summarizes the metadata attributes assigned to latitude sexagesimal measure in
the preceding paragraphs in Section 5.3.
The table in Annex F also contains this data in the second metadata
column.
| Data Element Metadata Attribute Name | Example model: ISO 6709 Non-enumerated, Latitude |
| 1. Data Element Definition and Permissible Values | |
| Data Element Definition | The measure in degrees of the angular distance of a position on earth on a meridian north or south of the equator. |
| Permissible Values | Measures of Latitude in Degrees, Minutes, and Seconds |
| PV Begin Date | (Not Applicable) |
| PV End Date | (Not Applicable) |
| Value Domain Definition | All measures of the distance of an angle north or south of the equator measured in degrees, minutes, and seconds. |
| Character Set | English language |
| Domain type | Non-enumerated |
| Determinant Type | Range |
| Range Limits | 00-90 |
| Datatype | Alphanumeric |
| Minimum | 7 |
| Maximum | 13 |
| Format | A(13) +/-DDMMSS.SSSSS |
| Unit of Measure | Sexagesimal |
| Precision | Number of decimal places recorded. |
| 2. Data Element Name and Identifier | |
| Data Element Name Context | Registry |
| Data Element Name | Latitude Sexagesimal Measure |
| DE Identifier/Version Number (DI:VI) | 312345:1 |
| 3. Other Metadata Attributes | |
| Example | +674532 and +674531.85435 |
| Origin | ISO 6709-1983 (E), Standard representation of latitude, longitude and altitude for geographic point locations. |
| Note/Description | Latitude on or north of the equator is preceded by a plus sign; south of the equator by a minus sign. |
| Submitting organization | Office of Information Resources Management |
| Data Steward | Larry Fitzwater |
| 4. Data Element Concept (DEC) | |
| Data Element Concept Name | Latitude Distance |
| Data Element Concept Definition | A measure of the angular distance of a point on the surface of the earth north or south of the equator |
| Conceptual Domain Name | Latitude Coordinates |
| Conceptual Domain Definition | The coordinates that indicate the distance north or south of the equator for locations. |
| Enumerated Value Meaning Text | (Not Applicable) |
| VM Begin Date | (Not Applicable) |
| VM End Date | (Not Applicable) |
| 5. Classification | |
| Keyword | Horizontal Coordinate, Latitude |
| Group | Geographic Point Locations |
| Representation Class | Measure |
| Object | Latitude |
| 6. Quality Control | |
| Registration Status | Recorded |
| Administrative Status | In quality review |
5.4 Application Data Element
Application data elements
are data elements that are used for a particular application. For this report, an application data
element, such as is found in a computer system application has been identified
as an example for data registration.
Data elements used in computer systems are associated with an entity
(e.g., table) and might be identified with a qualifier. The country name attribute in the mailing
address entity has been selected from an information management system that
contains data about facilities (i.e., the Facility Data System). This data element was selected to illustrate
the relationship between an application data element and a standard data
element with the same data values. It
also illustrates how a well defined data element might differ from one that is
identified from a computer application system.
The methodology is the same as that described in Section 5.1. It should be noted that many computer
application systems contain incomplete metadata. Often, only the data element name, the
datatype, and the field length are known about a data element. Where a data element can
reuse domain and conceptual information from a standard data element, as
in this example, it can be fully attributed and
registered as Recorded. Many data
elements, however, must be registered as Incomplete, and their Mandatory metadata
attributes might never be completed.
The table in Section 5.4.10
contains a summary of all the metadata for the application data element
described in this report.
5.4.1 Understanding the Data Element
The application data element
for country name, used in a mailing address, must be capable of being used on a
mail piece for delivery of mail to any country throughout the world. The country must be represented in such a
way that it is easily read and conforms to a known identifier for that country.
Therefore, authoritative names of all countries must be included in the value
domain. The name must be of a length
that will fit on one line of the address block.
5.4.2 Content Research
The United States Postal
Service mailing address standard requires that the country name be included as
the last line of a mail piece. Before
registering a data element for the country name used in a mailing address the
metadata registry for the RA is examined to determine if there is a data
element, value domain or permissible values, or data element concept and
conceptual domain that might be reused in attributing this data element.
A search of the registry
will find that a standard data element has been registered, based on the
international standard ISO 3166. The
standard data element is not specific enough to describe the application of the
data element in a mailing address entity.
The appropriate value domain for country name to be used in a mailing
address, however, should be the short name from the ISO 3166 standard. All value domain information for this
application data element (i.e., country name used in a mailing address) is the
same as for the ISO standard Short English-Language Country Name, described in
Section 5.2, and the conceptual domain for this data element is the same. Therefore, the data element is registered,
reusing domain information from the standard data element.
5.4.3 Definition and Permissible Values
5.4.3.1 Definition
The definition for the
country name attribute in the mailing address entity is formulated according to
the rules and guidelines listed in ISO/IEC 11179-4. The rules and guidelines are provided in Annex Y of this
document, with additional examples that will provide assistance in formulating
the definition. Because this data
element has been submitted through a computer application system (i.e., the
Facility Data System), the definition provided by the application system is
retained, identified by the context for the system. Name Context for this application data element is described in
Section 5.4.4.2. Definitions may be
entered into the registry in conjunction with the context used for the data
element name. The definition with the
context for the Facility Data System is "The name of a country where the
addressee is located." The
Registry name context definition includes the concepts for country identifier,
mailing address, and representation.
The rules and guidelines specified in ISO/IEC 11179-4 are used to
formulate the data element definition as "The name of the country where a
mail piece is delivered."
5.4.3.2 Permissible Values
The permissible values for a
data element are determined by the value domain. The application data element for mailing address country name
uses the same permissible values as the standard data element for
English-language short country names listed in the ISO 3166 standard (e.g.,
Afghanistan, Albania, ......., Zimbabwe).
Each permissible value is entered into the registry with the date when
that permissible value was valid for that value domain (in this case the date
is January 10, 1997, the same as the begin date for the value meaning). There is no end date to enter at this time.
The scope of the permissible
values for this data element includes the short English-language name for all countries. The value domain is described as "All
short, English-language names of all countries." Note that Part 3 of ISO/IEC 11179 does not require a description
or definition for enumerated domains.
Some RAs, however, prefer that all value domains be registered with a
description/definition. Record the
other value domain attributes for this example at this time, including:
$ Character Set: The character set for Short English-Language Country Names is "English language."
$ Domain Type: Country names are a fixed list of countries, maintained by international standards; therefore, the domain type is "enumerated."
$ Datatype: The datatype for country name is "alphanumeric."
$ Maximum and minimum field lengths: Based on prior research (Section 5.2.1), the minimum length for values for the data element is known to be four. The known maximum length for names in the current standard is 44. The maximum field length, however, is set to 60, to accommodate any changes or additions to the domain of values.
$ Format: The format selected by the registration authority for this example is A(60) to accommodate the longest of the English-language short names.
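The enumerated value domain described above lends itself to a simple membership check. The following Python sketch is illustrative only; the constant and function names are assumptions, and the set holds just a handful of the ISO 3166 short names mentioned in this document rather than the full list.

```python
# Field length limits taken from the value domain attributes above.
MINIMUM_LENGTH = 4
MAXIMUM_FIELD_LENGTH = 60

# Illustrative subset only; a registry would hold every permissible value.
SAMPLE_PERMISSIBLE_VALUES = frozenset(
    {"Afghanistan", "Albania", "China", "Denmark", "Zimbabwe"}
)


def is_permissible_country_name(
    value: str, permissible_values: frozenset = SAMPLE_PERMISSIBLE_VALUES
) -> bool:
    """Check a value against an enumerated domain and its length limits."""
    if not MINIMUM_LENGTH <= len(value) <= MAXIMUM_FIELD_LENGTH:
        return False
    # For an enumerated domain, validity is membership in the list of values.
    return value in permissible_values
```

Because the domain is enumerated, the length limits describe the storage format, while validity itself is decided by membership in the list.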
5.4.4 Identify and Name the Data Element
5.4.4.1 Identification
For this example, the data
element for the country name used in a mailing address is assigned a unique
data identifier (DI) and version identifier (VI) (5394:1) by the computer
application software when it is entered into the metadata registry.
5.4.4.2 Name Context and
Naming Convention
This data element is
assigned two names, each with its own context.
First is the system name context, since this data element was identified
as contained in an application system, and retaining the name used by the
system is valuable for documenting the system.
The naming convention that has been established for this application
system is as follows:
$ Scope: The scope of this example naming convention is application data elements in the Facility Data System.
$ Authority: The authority for this example is the U.S. Environmental Protection Agency for its Environmental Data Registry.
$ Semantic Rules: Names shall be the same as those used by the application software, using the convention of Entity Name.Attribute Name.
$ Lexical Rules: A data element name shall have a maximum of 200 alphanumeric characters. The language of the registry shall be English, and the character set ASCII. There are no controlled word lists.
$ Name Uniqueness: Names shall be unique within a registration authority for the entity.attribute relationship.
The second name to be
assigned to this data element is the registry name. It follows the naming convention for registry name context, as
described in Annex Y.
$ Scope: The scope of this example naming convention is Registry Name.
$ Authority: The authority for this example is the U.S. Environmental Protection Agency for its Environmental Data Registry.
$ Semantic Rules: Names shall include an object and a property, where appropriate. Qualifiers shall be used to differentiate between names that would otherwise be the same. The representation class term shall always be included as the last term in the name.
$ Lexical Rules: A data element name shall have a maximum of 100 alphanumeric characters. The language of the registry shall be English, and the character set ASCII. There are no controlled word lists.
$ Name Uniqueness: Names shall be unique within a registration authority.
5.4.4.3 Name the Data
Element
When documenting an
application system, it is important to know the name of the system and the
entity in which the data element exists as an attribute. This data element is assigned a name for the
context "Facility Data System."
It is also valuable to know the name of the attribute in that
system. For this example, the system
name is Facility Data System, which is documented in the registry as a
system. The name of the attribute in
the system is Country_Name, and the entity name is Mailing_Address. Therefore, the data element name for the
context Facility Data System is Mailing_Address.Country_Name.
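A data element that carries one name per context can be sketched as a mapping from context to name. The Python below is an illustration under assumptions: the class, method, and helper names are hypothetical, and only the maximum-length lexical rule of each naming convention is enforced. The registry-context name shown is the one derived for this data element in this section.

```python
from dataclasses import dataclass, field
from typing import Dict

# Maximum name length per context, from the two example naming conventions.
MAX_NAME_LENGTH = {"Facility Data System": 200, "Registry": 100}


def system_name(entity_name: str, attribute_name: str) -> str:
    """Apply the Entity Name.Attribute Name convention of the system context."""
    return f"{entity_name}.{attribute_name}"


@dataclass
class RegisteredDataElement:
    identifier: str  # DI:VI
    names: Dict[str, str] = field(default_factory=dict)

    def add_name(self, context: str, name: str) -> None:
        """Record one name for one context, enforcing that context's length limit."""
        if len(name) > MAX_NAME_LENGTH[context]:
            raise ValueError(f"name too long for context {context!r}")
        self.names[context] = name


element = RegisteredDataElement("5394:1")
element.add_name(
    "Facility Data System", system_name("Mailing_Address", "Country_Name")
)
element.add_name("Registry", "Mailing Address Country Name")
```

Keeping the context as the key lets the registry retain the system's own name for documentation while presenting the registry name for general retrieval.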
The data element name with
Registry as context should identify the data values to be contained in the
value domain (i.e., country) and the entity (i.e., address) associated with the
data element. It should also include
the name of the representation class.
For the application data element (e.g., country name in a mailing
address entity) the entity is "address" qualified by
"mailing." The data values and representation are the same as for the
ISO standard data element.
The qualifier is
appropriate, since the registry might also have an application data element
that designates the country name in a geographic (i.e., physical location)
address entity. The qualifier is needed
to discriminate between the country name in mailing and geographic
addresses. The guidelines described in
Section 5.1.4 should be followed. The
registry name of this data element, based on ISO/IEC 11179-5 guidelines, is
"Mailing Address Country Name."
5.4.5 Other Metadata Attributes
Other metadata attributes
that can be recorded at this time are:
$ Select the example for this data element; it must be one of the permissible values in the value domain.
Example: China
$ Identify the origin for this data element as the source from which the data element was obtained; in this example, the application system.
Origin: Facility Data
System, Environmental Protection Agency, Office of Enforcement and Compliance
Assessment.
$ Record any notes or comments that might provide additional information about the data element that is not included in the definition.
Note: The country name is
always located as the last line of a mail piece for international
mailings.
$ Enter the name of the submitting organization, which is the Office that submitted the data element for registration.
Submitting Organization:
Office of Enforcement and Compliance Assessment
$ Record the name of the individual or organization assigned the responsibility for monitoring and maintaining the data element as the data steward.
Data Steward: James Jones
$ Administrative metadata, such as Create Date and User Name are recorded or captured automatically by the system where applicable.
5.4.6 Data Element Concept
The data element concept for
this data element includes the object class (entity) of address, as well as the
property of being a country identifier.
It does not include the qualifier for "mailing." This data
element concept is not the same as the concept for the standard Country Short
Name data element, which is limited to the concept of country identifier. The name of this data element concept, following
the guidelines described in Section 5.1.6, is "Address Country
Identifier" and the data element concept definition is "An identifier
for an address of a primary geopolitical entity of the world." This data element concept could be reused
for other address country identifiers, such as a geographic address country
name, a geographic country code, or other representations and data element
qualifiers.
The conceptual domain for
this application data element is the conceptual domain for all the countries of
the world. It uses the same value
meanings and the same permissible values as the standard data element for
country name. Therefore it reuses the
conceptual domain and the value domain that were established for the ISO
standard, Short English-Language Country Name.
5.4.7 Classification
This data element might be
classified according to the following classification schemes:
$ Identify one or more keywords, where the keyword is a name or subject matter descriptor that will facilitate grouping like data elements for retrieval.
Keyword: Country, Mailing
Address
$ Group the mailing address country name with similar data elements according to concept for translation or by general subject matter.
Subject group: Mailing
Address
$ Identify the class by which this data element is represented.
Representation Class: Name
$ One or more real world objects that identify this data element can be identified at this time.
Object: Country, Mailing
Address
5.4.8 Quality Control
When all of the mandatory
metadata attributes have been entered for this data element, it is assigned the
registration status of "Recorded" and the administrative status of
"In Quality Review." This
data element was identified by an application, and so would often not be
completely attributed. This application
data element, however, has been completed by reusing the value domain,
permissible values, and conceptual domain of a standard data element, and so
can be entered with a registration status of Recorded.
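The status decision described above can be sketched as a completeness check. This Python illustration is hedged: the list of mandatory attributes is an assumption chosen for the example and is not the normative list from ISO/IEC 11179, and the function name is hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: an illustrative subset of mandatory attributes, not the normative list.
MANDATORY_ATTRIBUTES = (
    "name",
    "definition",
    "datatype",
    "maximum_length",
    "permissible_values",
)


def registration_status(attributes: Dict[str, object]) -> Tuple[str, List[str]]:
    """Return the registration status and the mandatory attributes still missing."""
    missing = [key for key in MANDATORY_ATTRIBUTES if not attributes.get(key)]
    if missing:
        return "Incomplete", missing
    return "Recorded", []


# Metadata typically known from an application system alone.
from_application = {
    "name": "Mailing_Address.Country_Name",
    "datatype": "Alphanumeric",
    "maximum_length": 60,
}

# The same data element after reusing the standard data element's domain information.
with_reuse = dict(
    from_application,
    definition="The name of the country where a mail piece is delivered.",
    permissible_values=["Afghanistan", "Albania", "Zimbabwe"],
)
```

With only the application metadata the sketch reports "Incomplete"; once the reused definition and permissible values are attached it reports "Recorded."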
5.4.9 Related Data Elements
Data elements related to
this application data element for Country Name are other data elements that are
used in the mailing address entity, including such data elements as street name
or other delivery point, city or other jurisdictional name, state or province
name or code, and ZIP+4 code or other international postal code. None of these share the same value domains,
conceptual domains, or permissible values. The data elements, however, can be classified as a group that makes
up the Mailing Address entity.
5.4.10 Summary of Metadata Attributes
The following table contains
a summary of the values assigned to the metadata attributes in the preceding
paragraphs of Section 5.4. The table in
Annex F also contains this metadata.
| Data Element Metadata Attribute Name | Example model: Application Enumerated, (System Reference) |
| 1. Data Element Definition and Permissible Values | |
| Data Element Definition Context | Registry | Facility Data System |
| Data Element Definition | The name of the country where a mail piece is delivered. | The name of a country where the addressee is located. |
| Permissible Values | All English-Language Short Country Names from ISO 3166, matched with value meanings. (Afghanistan, Albania,......, Zimbabwe) |
| PV Begin Date | 19971001 |
| PV End Date | (Not Applicable) |
| Value Domain Definition | All English-language short names of all countries. |
| Character Set | English language |
| Domain type | Enumerated |
| Determinant Type | (Not Applicable) |
| Range Limits | (Not Applicable) |
| Datatype | Alphanumeric |
| Minimum | 4 |
| Maximum | 44 |
| Format | A(60) |
| Unit of Measure | (Not Applicable) |
| Precision | (Not Applicable) |
| 2. Data Element Name and Identifier | |
| Data Element Name Context | Registry | Facility Data System |
| Data Element Name | Mailing Address Country Name | Mailing_Address.Country_Name |
| DE Identifier/Version Number (DI:VI) | 5394:1 |
| 3. Other Metadata Attributes | |
| Example | China |
| Origin | Facility Data System, Environmental Protection Agency, Office of Enforcement and Compliance Assessment |
| Note/Description | This data element is required when mail is intended to be delivered outside the country of origin. |
| Submitting organization | Office of Enforcement and Compliance Assessment |
| Data Steward | James Jones |
| 4. Data Element Concept (DEC) | |
| Data Element Concept Name | Address Country Identifier |
| Data Element Concept Definition | An identifier for a primary geopolitical entity of the world which indicates an address. |
| Conceptual Domain Name | Countries of the World |
| Conceptual Domain Definition | The primary geopolitical entities of the world. |
| Enumerated Value Meaning Text | The primary geopolitical entity known as <Denmark>. |
| VM Begin Date | 19971001 |
| VM End Date | (Not Applicable) |
| 5. Classification | |
| Keyword | Country |
| Group | Mailing Address |
| Representation Class | Name |
| Object | Address |
| 6. Quality Control | |
| Registration Status | Recorded |
| Administrative Status | In quality review |
5.5 Register a Group of Data Elements
For some data elements, the
registration authority may determine that it is appropriate to group them, because of
some observed relationship among the data elements or a perceived value in
identifying those data elements together.
After the data elements that are to be associated have been identified,
the group itself is registered with the metadata that provides certain
information about the group. The
metadata answers the following questions: How is the group identified? Why has
the group been established? What is the authority for the data elements in the
group? What is the potential use for the group of data elements?
Registering a group of data
elements in a metadata registry requires that certain characteristics of the
group are recorded to clearly describe and define it. The data elements are then associated with the group. The characteristics are stored as attributes
of the group. Attributes specific to a
group, as defined by one RA are:
$ Group Name: The name of a group of data elements.
$ Group Definition: Text that describes the features of, specifies relationships of, or establishes the context for a group of data elements.
$ Authoritative Source: The originating point of information that provides an authoritative reference for a group of data elements.
$ Source Rationale: The text that explains the reasons for using the selected source materials in development of a group of data elements.
$ Potential Usage Comments: The text that describes how a group of data elements can be used.
$ Group Identifier: The system generated identifier for a group of data elements.
Groups of data elements can
be registered in a registry, where a common relationship has been identified
among data elements including the following:
$ Information system architecture, where the data elements make up a logical entity (e.g., mailing address).
$ Data element components, where individual data elements are grouped to make another data element (e.g., urban style street address).
$ Usage, where the elements have a common usage (e.g., data elements in a data standard).
Each of these types of
group is described in the paragraphs that follow, with a list of the data
elements that have been grouped together. The table in Exhibit 5.2 illustrates the information
necessary to register the group characteristics for each of the three
groups.
5.5.1 Information System Entity Group
Mailing Address
is an example of an entity in a system architecture where it is appropriate to
group together data elements that are attributes in that entity. The data elements for this entity
are identified as:
_ Urban-style Street Address Text. The text that describes the urban‑style street name and number where the mail is delivered.
_ Post Office Box Number. The number of the post office box where mail for the addressee is delivered.
_ Mailing Address City Name. The name of the city, town, or village where the mail is delivered.
_ Mailing Address State Code. The alphabetic code assigned by the U.S. Postal Service that represents the state where the mail is delivered.
_ Mailing Address Postal Code. The code assigned by a postal service that provides information about the location of a place where mail is delivered.
_ Mailing Address Country Name. The name of the country where a mail piece is delivered. Note: Required only for international mailings.
An example of metadata for
the Mailing Address group is provided in Exhibit 5.1.
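The group attributes listed in Section 5.5 and the membership of the Mailing Address group can be sketched as a single record. The Python below is an illustration only: the class and field names are assumptions that mirror the attribute names, and the identifier, authoritative source, source rationale, and usage comments are hypothetical placeholders rather than registered values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElementGroup:
    group_identifier: str
    group_name: str
    group_definition: str
    authoritative_source: str
    source_rationale: str
    potential_usage_comments: str
    data_element_names: List[str] = field(default_factory=list)


mailing_address_group = DataElementGroup(
    group_identifier="GRP-0001",  # hypothetical; system generated in practice
    group_name="Mailing Address",
    group_definition="A set of data elements that can be used to create a mailpiece.",
    authoritative_source="(placeholder) postal addressing standard",  # hypothetical
    source_rationale="(placeholder) reason the source was selected",  # hypothetical
    potential_usage_comments="(placeholder) how the group can be used",  # hypothetical
    data_element_names=[
        "Urban-style Street Address Text",
        "Post Office Box Number",
        "Mailing Address City Name",
        "Mailing Address State Code",
        "Mailing Address Postal Code",
        "Mailing Address Country Name",
    ],
)
```

The data elements remain separately registered; the group record only associates them by name.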
5.5.2 Composite Data Element
Composite data elements are
made up of more than one distinct data element that cannot be subdivided
further, and that are maintained in a registry as separate data elements. Urban-style street address is an example of
a composite data element. It contains
the following data elements:
_ Building Number. The number assigned to a building or a land parcel along the street to identify location and to ensure accurate mail delivery.
_ Pre‑Directional Code. The code that represents the direction the street has taken from some arbitrary starting point, and that precedes the street name.
_ Street Name. The name assigned to a street or road, not including other urban‑style street address components.
_ Street Suffix Code. The code that represents the qualifier that follows the street name in a street address.
_ Post‑Directional Code. The code that represents the direction the street has taken from some arbitrary starting point, and that follows the street suffix.
_ Secondary Unit Code. A code that represents the type of secondary unit where mail is delivered, e.g., the code for room, suite, or apartment.
_ Suite Number. The number that represents the specific room, apartment, or other secondary component of an address.
Each
of the data elements in the composite data element group is a distinct data
element that cannot be further subdivided.
The directional codes, street suffix codes, and secondary unit codes all
have enumerated domains that are used to validate portions of the street address. The street address, however, is used as one
item of data on a mail piece, and is, therefore, appropriately registered as an
individual data element.
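The relationship between the component data elements and the composite can be sketched as follows. This Python illustration rests on assumptions: the class and method names are hypothetical, the set of directional codes is a sample stand-in for the enumerated domain, and the address values are invented for the example.

```python
from dataclasses import dataclass

# Assumption: a sample enumerated domain for the two directional codes.
DIRECTIONAL_CODES = frozenset({"N", "S", "E", "W", "NE", "NW", "SE", "SW"})


@dataclass
class UrbanStyleStreetAddress:
    building_number: str
    street_name: str
    pre_directional_code: str = ""
    street_suffix_code: str = ""
    post_directional_code: str = ""
    secondary_unit_code: str = ""
    suite_number: str = ""

    def __post_init__(self) -> None:
        # Components with enumerated domains are validated before composition.
        for code in (self.pre_directional_code, self.post_directional_code):
            if code and code not in DIRECTIONAL_CODES:
                raise ValueError(f"unknown directional code: {code!r}")

    def as_text(self) -> str:
        """Combine the components into the single item of data used on a mail piece."""
        parts = [
            self.building_number,
            self.pre_directional_code,
            self.street_name,
            self.street_suffix_code,
            self.post_directional_code,
            self.secondary_unit_code,
            self.suite_number,
        ]
        return " ".join(part for part in parts if part)


# Hypothetical address used only to exercise the sketch.
example_address = UrbanStyleStreetAddress(
    building_number="100",
    street_name="MAIN",
    pre_directional_code="N",
    street_suffix_code="ST",
    secondary_unit_code="STE",
    suite_number="200",
)
```

Here `example_address.as_text()` yields "100 N MAIN ST STE 200", the composite value, while each component is still held and validated separately.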
5.5.3 Use Group
A use group is a
group of data elements that are used together, perhaps for
purposes of data
translation
(e.g., the ISO 3166 group of data elements that can be used to translate names
and coded values that identify a country) or for data transfer (e.g., ISO 6709,
which specifies formats for transfer of latitude, longitude, and altitude values
that distinguish a geographic point).
Data elements for a Geographic Point Location group, based on ISO 6709,
include the following data elements:
_ Latitude Degrees Measure. The measure in degrees of the angular distance of a position on earth on a meridian north or south of the equator.
_ Longitude Degrees Measure. The measure in degrees of the angular distance of a position on earth on a meridian east or west of the prime meridian.
_ Altitude. The measure of the distance in meters of a position above or below the surface of a reference datum.
_ Latitude Sexagesimal Measure. The sexagesimal measure of the angular distance of a position on earth on a meridian north or south of the equator.
_ Longitude Sexagesimal Measure. The sexagesimal measure of the angular distance of a position on earth on a meridian east or west of the prime meridian.
The
latitude and longitude data elements provide information about the formats and
units of measure that enable translation of the data for data sharing. The rules associated with the standard
provide instructions for grouping the data elements for data sharing (e.g.,
latitude and longitude must be measured by the same unit when grouped together
for data transfer).
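The same-unit grouping rule can be illustrated with a short Python sketch. It is informative only: the class names, the unit labels, and the conversion helper are assumptions made for illustration and are not taken from ISO 6709.

```python
from dataclasses import dataclass

DEGREES = "degrees"          # decimal degrees
SEXAGESIMAL = "sexagesimal"  # degrees, minutes, seconds


def sexagesimal_to_degrees(degrees, minutes, seconds, negative=False):
    """Convert degrees, minutes, and seconds to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if negative else value


@dataclass(frozen=True)
class AngularMeasure:
    # A float for DEGREES; a (degrees, minutes, seconds, negative) tuple for SEXAGESIMAL.
    value: object
    unit: str

    def in_degrees(self) -> float:
        """Translate the measure to decimal degrees for data sharing."""
        if self.unit == SEXAGESIMAL:
            return sexagesimal_to_degrees(*self.value)
        return float(self.value)


@dataclass(frozen=True)
class GeographicPointLocation:
    latitude: AngularMeasure
    longitude: AngularMeasure

    def __post_init__(self):
        # Grouping rule from the text: both coordinates must share one unit of measure.
        if self.latitude.unit != self.longitude.unit:
            raise ValueError("latitude and longitude must be measured by the same unit")
```

Recording the unit alongside each value is what allows translation between the sexagesimal and degrees forms without ambiguity.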
The
following table contains examples of the metadata that should be captured about
a group of data elements when the group is registered.
| | Information System Entity Group | Composite Data Element Group | Usage Group |
|---|---|---|---|
| Group Name | Mailing Address | Urban-style Street Address | Geographic Point Locations |
| Definition | A set of data elements that can be used to create a mailpiece. | A set of precise and complete data elements that cannot be subdivided and that can be combined into an urban-style street address. | The horizontal and vertical coordinates and associated metadata that define a point on earth. |
| Group Source | U.S. Postal Service, Publication 28: Postal Address Standards | U.S. Postal Service, Publication 28: Postal Address Standards | International Standard ISO 6709 |
| Source Rationale | The U.S. Postal Service is the nationally recognized authority for defining the requirements for creating a mailpiece, in addition to being responsible for most mail delivery within the U.S. | The U.S. Postal Service is the nationally recognized authority for defining the requirements for creating a mailpiece, and for maintaining standards and domains for formatting street address information. | ISO data standards are used internationally for consistent representation of data that enables data sharing. The standard also provides rules for formatting spatial data transfer files. |
| Usage Comments | System developers will use the Mailing Address group when creating an entity for mailing address. | The Street Address group is used to parse the components of an urban-style street address into individual segments for validation and to facilitate searching. | The geographic point locations group is used by system developers to develop a system entity for spatial data, to develop translation software, and data transfer files. |
Exhibit 5.1. Metadata for Groups
5.6 Linking of Data Elements
Data elements can be linked
based on their levels of abstraction.
Linkages can occur in both vertical relationships and horizontal
relationships, defined as follows:
• Vertical relationships are those where a data element that has been qualified for a particular purpose is related to a more generic data element that is not qualified and is intended for a more general purpose. For example, the following data elements can be linked vertically in parent/child relationships, where 1 is the highest, and the vertical linkages increment by 1:
1 State USPS Code: The U.S. Postal Service abbreviation that
represents a state or state equivalent for the U.S. (DI:VI 48:1)
2 Mailing Address State Code: The alphabetic code assigned by
the U.S. Postal Service that represents the state where the mail is
delivered. (DI:VI 5408:1)
3 Facility Mailing Address State Code: The code that represents a state of the
United States in the mailing address for a facility. (DI:VI 5680:1)
• Horizontal relationships are those where data elements with different names have equivalent definitions that represent equivalent/equal data domains. For example, the following data elements can be linked horizontally as equivalents in Envirofacts, a data warehouse of EPA environmental systems.
The third-level data element, Facility Mailing Address State Code (DI:VI 5680:1),
is linked horizontally to:
3a PCS_PERMIT_FACILITY.MAILING_STATE The state in the primary facility mailing address. (DI:VI
24684:1)
3b BRS_SITE_INFORMATION.MAIL_STATE The two-character state postal code for the site's mailing
address. (DI:VI 23984:1)
3c RCR_MAILING_LOCATION.STATE
The two-letter postal code for the state in the address associated with
the facility mailing address. (DI:VI 24528:1)
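The vertical and horizontal linkages above can be recorded in a simple structure, sketched here in Python. The sketch is informative only. The class and method names are hypothetical, and the identifiers are the DI:VI values quoted in the examples above.

```python
from collections import defaultdict


class LinkRegistry:
    """Holds vertical (child to parent) and horizontal (equivalence) links."""

    def __init__(self):
        self._parent = {}
        self._equivalents = defaultdict(set)

    def ancestors(self, identifier):
        """Return the chain of more generic data elements, nearest first."""
        chain = []
        current = identifier
        while current in self._parent:
            current = self._parent[current]
            chain.append(current)
        return chain

    def link_vertical(self, child, parent):
        # Refuse links that would make a data element its own ancestor.
        if child == parent or child in self.ancestors(parent):
            raise ValueError("vertical link would create a cycle")
        self._parent[child] = parent

    def link_horizontal(self, first, second):
        self._equivalents[first].add(second)
        self._equivalents[second].add(first)

    def equivalents(self, identifier):
        return sorted(self._equivalents.get(identifier, set()))


links = LinkRegistry()
links.link_vertical("5408:1", "48:1")     # Mailing Address State Code -> State USPS Code
links.link_vertical("5680:1", "5408:1")   # Facility Mailing Address State Code -> Mailing Address State Code
for equivalent in ("24684:1", "23984:1", "24528:1"):
    links.link_horizontal("5680:1", equivalent)
```

With these links in place, `links.ancestors("5680:1")` walks up the vertical chain to the generic State USPS Code, and `links.equivalents("5680:1")` lists the three system-specific data elements.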
5.7 Registration of Associated Sources/Documents
Talk about documents, citations, classified components. There are at least 5 or 6
accepted standards for citation. Need to register forms that have about 30 to
40 data elements (items). Should include a graphic (picture) of forms.
6 Complex Data
Many
organizations produce data for internal or external use. As a result, information that describes that
data (metadata) must be readily available.
With the advent of electronic access to data through the Internet and
other media, the metadata must be accessible electronically, too. Registries are deployed to manage and
organize the metadata, and standards such as ISO/IEC 11179 address the content
and basic functions of those registries.
Organizations
around the world are implementing registries based on the framework described
in ISO/IEC 11179 and the metamodel defined in ANS X3.285. However, the framework has limitations that
constrain the usefulness of the registries.
The proposed modifications to ANS X3.285 will remedy some of these
limitations.
ISO/IEC
11179 addresses the specification and standardization of data elements. The metadata that is specified in the
standard describes data elements at the fundamental level. Organizations that produce and use data
generate new data elements from existing ones, and the standard does not
address this issue. Also, object
oriented technology, multimedia applications, and advanced scientific applications
produce very complex data types that are not described very well by the
standard.
Some
data elements are generated from other existing ones in many ways. Mathematical calculations (e.g. variance
estimations), aggregation (e.g. multivariate cross tabulation), concatenation
(e.g. formation of telephone number from its constituent parts), or grouping
(e.g. address) are typical examples.
Metadata registries that contain the descriptions of how data elements
are generated from others will help users to understand the data more fully.
Even
the fundamental data elements of an organization, ones that are not generated
from others in the sense described above, can be generated. The functions of the business themselves can
generate data elements. Identifying
these functions, especially within the context of the organization, will help
users increase their understanding of data.
At this point in time, the only identified types of complex data are derived data
and data groups. These are defined as follows:
• Derived Data Element - A data element
whose values are derived through a transformation of the values of one or more
other data elements. This transformation may be mathematical, logical, a linkage,
or some other type (including a combination of these basic types).
• Data Group - A set of data elements
considered as a logical unit.
An
important point about data groups is that they are equivalent to abstract
derived data elements, where an abstract data element is a data element that is
not part of a particular application.
This view means that data groups don't need to be treated separately.
These minor changes to ANS X3.285 will improve the handling of complex data items:
• Have the Rule entity account for the transformation formula.
• Put an attribute, called role, on the relationship between Data Element and Rule
to distinguish data elements that are input to a transformation from data
elements that are output from a transformation (derived).
• A lookup table entity, such as Derivation Type, is needed to keep track of
the type of transformation used.
• A recursive or hierarchical relationship on Rule is necessary to account
for combinations of transformations.
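The four proposed changes can be sketched together in Python. The sketch is informative only and is not the ANS X3.285 metamodel: the class layout, the `apply` method, and the telephone data element names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DerivationType:
    """Lookup entry recording the type of transformation used."""
    name: str


@dataclass(frozen=True)
class RuleParticipant:
    """Relationship between a data element and a rule, qualified by role."""
    data_element: str
    role: str  # "input" or "output" (derived)


@dataclass
class Rule:
    name: str
    derivation_type: DerivationType
    formula: Callable[..., object]  # the transformation formula
    participants: List[RuleParticipant] = field(default_factory=list)
    sub_rules: List["Rule"] = field(default_factory=list)  # recursive relationship

    def elements_with_role(self, role: str) -> List[str]:
        return [p.data_element for p in self.participants if p.role == role]

    def apply(self, values: Dict[str, object]) -> Dict[str, object]:
        """Run sub-rules first, then this rule, and return the extended values."""
        result = dict(values)
        for sub_rule in self.sub_rules:
            result = sub_rule.apply(result)
        derived = self.formula(*(result[name] for name in self.elements_with_role("input")))
        for name in self.elements_with_role("output"):
            result[name] = derived
        return result


# Hypothetical example: a telephone number formed by concatenating its parts.
CONCATENATION = DerivationType("concatenation")
local_number = Rule(
    name="Form local number",
    derivation_type=CONCATENATION,
    formula=lambda exchange, line: f"{exchange}-{line}",
    participants=[
        RuleParticipant("Telephone Exchange Code", "input"),
        RuleParticipant("Telephone Line Number", "input"),
        RuleParticipant("Telephone Local Number", "output"),
    ],
)
telephone_number = Rule(
    name="Form telephone number",
    derivation_type=CONCATENATION,
    formula=lambda area, local: f"({area}) {local}",
    participants=[
        RuleParticipant("Telephone Area Code", "input"),
        RuleParticipant("Telephone Local Number", "input"),
        RuleParticipant("Telephone Number", "output"),
    ],
    sub_rules=[local_number],
)
```

The `sub_rules` list stands in for the recursive relationship on Rule: the outer rule depends on a data element that is itself derived by the inner rule.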
Annex A
Bibliography
[1] ISO 1087:1990 Terminology - Vocabulary.
[2] ISO/DIS 1087-1 Terminology - Vocabulary - Part 1: Theory and application (Partial revision of ISO 1087:1990).
[3] ISO/DIS 1087-2 Terminology - Vocabulary - Part 2: Computer applications (Partial revision of ISO 1087:1990).
[4] ISO/IEC 2382:1979-1998 Parts 1-32 Information technology - vocabulary.
[5] ISO 2788:1986 Documentation - Guidelines for the establishment of monolingual thesauri.
[6] ISO 3166-1:1997 Codes for the representation of names of countries and their sub-divisions.
[7] ISO 5964:1985 Documentation - Guidelines for the establishment of multilingual thesauri.
[8] ISO 6709, 1983-05-15 Standard representation of latitude, longitude and altitude for geographic point locations.
[9] ISO/IEC 7826-1:1994 Information technology - General structure for the interchange of code values - Part 1: Identification of coding schemes.
[10] ISO/IEC 7826-2:1994 Information technology - General structure for the interchange of code values - Part 2: Registration of coding schemes.
[11] ISO/IEC 11404:1996 Information technology - Programming languages, their environments and system software interfaces - Language-independent datatypes.
[12] SC32 N0147 Horizontal Issues and Encodable Value Domains in Electronic Commerce.
[13] ANSI X3.61-1986 Representation of Geographic Point Locations for Information Interchange.
[14] Firesmith, Donald G., Object-Oriented Requirements Analysis and Logical Design, John Wiley and Sons, New York, 1993.
[15] Senehi, M.K., and Thomas R. Kramer, "A Framework for Control Architectures," International Journal of Computer Integrated Manufacturing, Vol. 11, No. 4, July-August, 1998, pp. 347-363.
[16] Zachman, John A., The Framework for Enterprise Architecture: Background, Description and Utility, 1997, http://www.ozemail.com.au/~visible/papers/zachman3.htm
Annex B
Definitions of representation class terms
• Amount - the sum total of two or more quantities; an aggregate.
• Code - a symbol used to represent something.
• Discriminator - A distinction that differentiates one from another.
• Graphic - diagrams, graphs, mathematical curves, or the like.
• Identifier - Something that represents to be, regards, or treats as the same or identical.
• Indicator - Anything that serves to point out or direct attention to, as of a measuring device.
• Label - A short word or phrase descriptive of a person, group, or intellectual movement, or indicating that what follows belongs in a particular category or classification.
• Measure - the extent, dimensions, quantity, etc. of something ascertained by comparison with a standard.
• Name - a word or combination of words by which a person, place, object, or thought is known.
• Number - a numeral or group of numerals.
• Picture - a visual representation of a person, object, or scene.
• Quantity - the property of magnitude of something.
• Text - a unit of connected speech or writing often composed of one or more sentences that form a cohesive whole.
• Tag - A descriptive word or phrase applied to a person, group, organization, etc. as a label or means of identification or epithet.
Annex C
Principles of
managing shared data
These
principles were used while developing the metamodel. Each principle is directly or indirectly supported by the
metamodel. Conversely, this is an
itemized description of much of the conceptual data structure depicted in the
data model. It includes many of the
more significant:
• Fundamental principles and “business
rules” for data registration.
• Definitions that are applicable within
the scope of this standard.
• Constraints and integrity rules for the
data used for data registration.
• Structural relationships and
cardinalities among data element components.
• References to terminology used
elsewhere.
• Objectives for good information
management.
These principles are fundamental to the use of a conformant implementation of this
metamodel. If the user deviates from
any principle, the resulting data registry may not realize expectations.
C.1 Data
C.1.1 Data is a representation of a fact, idea,
or instruction in a formalized manner suitable for communication, interpretation,
or processing by humans or by machines. (This definition refers to a group
taken as a unit; thus, it is used with a singular verb.)
C.1.2 Data must be able to be created, collected,
organized, recorded, processed, and stored in a medium in a retrievable form.
C.1.3 Data represents data element concepts
(i.e., the properties of object classes) by using a set of symbols that are
perceived. These may be words made up
of characters, icons, sounds, Braille, etc.
C.1.4 Data allows us to consider an object that
exists in the real world without having the actual object present. In other words, data provides an abstraction
of the real world object.
C.1.5 Data that is derived should be registered
the same as any other data if it is stored.
C.1.6 An instantiation of a single element of
data is called a “data item” (a.k.a. datum).
C.1.7 A single type or class of structured data
treated as a cohesive whole is called a “data unit”.
C.1.8 A single unit of data that is considered
indivisible within a universe of discourse is called a "data
element". It is identical to what
some others call a "simple data element".
C.1.9 Data used to describe the meaning or
characteristics of data is called "metadata".
C.2 Concept
C.2.1 A concept is a unit of thought (an idea)
constituted by the abstraction of the common characteristics of a set of
objects.
C.2.2 An object may be any person, place, event,
or other thing that has separate and distinct existence in the real world.
C.2.3 Each concept can be shown as a more
specialized type, or a component part, of one or more higher-ordered concepts.
C.2.4 A concept inherits characteristics from one
or more generalized supertype(s).
C.3 Object class
C.3.1 Humans tend to group objects when they have
similar traits. When we group a set of
similar things, we refer to it as a "type" or "class". A single category of "things" or
"objects" is called an "object class".
C.3.2 An object class is a set of concepts,
abstractions, or things in the natural world that can be identified with
explicit boundaries and meaning and whose properties and behavior all follow
the same rules.
C.3.3 Object classes may be a single concept or a
set of concepts in a relationship with each other to form a more complex
concept. Concepts in relationship with
other concepts are sometimes called "concept systems".
C.3.4 Data is a representation of properties of
object classes.
C.3.5 An object class is the same as an entity
(entity type) or relationship in the relational paradigm.
C.3.6 It is desirable to describe object classes
without redundancy within the universe of discourse. The same object class, but with different names and/or wording of
definitions, should eventually be normalized.
C.4 Property
C.4.1 A property is a classification of any
feature that humans naturally use to distinguish one individual object from
another.
C.4.2 When we describe an object, we describe its
properties. If we know nothing about
the kind of properties an object has, we are not aware of the object.
C.4.3 A property class refers to the conceptual
part of an attribute, i.e., without representation.
C.4.4 A property class has no particular
associated means of representation by which it can be communicated.
C.4.5 A property class may be associated with
more than one object class where it describes a conceptual attribute (one
without representation).
C.4.6 A property class is a concept playing the
role of a property class in a data element concept. Only certain concepts have the ability to behave as a property
class. Whether one of these concepts is
acting as a property class cannot be determined until it is associated with an
object class in a data element concept.
C.4.7 It is desirable to describe properties
without redundancy within the universe of discourse. The same property class, but with different names and/or wording
of definitions, should eventually be normalized (a.k.a. harmonized or rationalized).
C.4.8 Properties are sometimes called
"characteristics".
C.5 Data element concept
C.5.1 A data element concept is the union of two
or more concepts with one concept playing the role of a property.
C.5.2 A data element concept is the human
perception of a single property of an object class, identified and described
independently of any particular representation.
C.5.3 A data element concept has a definition
different from its object class or property.
C.5.4 While any specifically defined data element
concept may have several representations in a universe of discourse, each such
data element concept should have a preferred data element representation in a
data registry.
C.5.5 If an object class and a property are
normalized across the universe of discourse, the data element concept will also
be normalized.
C.5.6 Since the object class and the property
have no representation, the data element concept will have no representation.
C.5.7 A data element concept may be represented
as a data element.
C.5.8 Data element concepts are sometimes called
"Basic Semantic Units".
C.6 Attribute
C.6.1 An attribute is a characteristic of an
object class that the business chooses to record as data.
C.6.2 An attribute is always associated with only
one object class.
C.6.3 An attribute is complex. It is composed of both a property and a
representation. The concept of an
attribute is separate from how it is represented.
C.6.4 When a characteristic of a data unit is
being described, the attribute is called a "meta-attribute".
C.6.5 The metadata used to describe data units
requires many meta-attributes. A set of
meta-attributes of data units bundled together as a module for reusability is
called a "metadata set".
C.7 Representation
C.7.1 Before a data element concept can become a
data unit it must be expressed as a term, character, symbol, et cetera that
represents a meaning of the property class.
Such a notation is called "representation".
C.7.2 Representation describes how a data element
concept appears in a persistent store, on a screen, on paper, et cetera. Representations are human-interpretable
(sound, tactile, visual).
C.8 Data element representation
C.8.1 A data element representation is the part
of a data element having a value domain, datatype, and, if a quantity, a unit
of quantity.
C.8.2 A set of similar data element
representations (i.e., a "type" or "class") are grouped as
a representation class for classification purposes.
C.8.3 A data element representation may be
associated with one or more data element concepts.
C.8.4 The permissible values of a value domain
may be expressed by specifying the range from its lower to upper limit, by a
rule, by a procedure or scheme, or by enumeration in a finite list.
C.8.5 A data element representation may have a
"compound datatype" that separates the representation into
constituent parts. A compound datatype
would only be plausible where the data element representation could be used as the
representation of a single data element concept.
C.8.6 A value domain may be an aggregation of a
set of smaller value domains.
C.8.7 It is desirable to describe data element
representations without redundancy within the universe of discourse. Data element representations with the
identical value domain, datatype, and, if a quantity, a unit of quantity,
should eventually be normalized.
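Principle C.8.4 names several ways to express the permissible values of a value domain. Three of them (range, rule, and enumeration) can be sketched in Python as follows. The sketch is informative only. The function names and the example domains are hypothetical, and specification by procedure or scheme is not shown.

```python
def range_domain(lower, upper):
    """Permissible values specified by a lower and an upper limit (inclusive)."""
    return lambda value: lower <= value <= upper


def enumerated_domain(values):
    """Permissible values specified by enumeration in a finite list."""
    permitted = frozenset(values)
    return lambda value: value in permitted


def rule_domain(rule):
    """Permissible values specified by a rule, given here as a predicate."""
    return lambda value: bool(rule(value))


# Hypothetical domains for illustration.
latitude_degrees = range_domain(-90.0, 90.0)
directional_code = enumerated_domain(["N", "S", "E", "W"])
two_letter_code = rule_domain(
    lambda v: isinstance(v, str) and len(v) == 2 and v.isalpha() and v.isupper()
)
```

Each helper returns a membership test, so the three kinds of domain can be used interchangeably when a value is checked against a data element representation.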
C.9 Data element
C.9.1 A data element is a single unit of data
that is considered indivisible in its shareable universe.
C.9.2 A data element cannot be decomposed into
more fundamental constituent parts of information that have useful meanings
within its shareable universe.
C.9.3 A data element is an electronic or written
representation of a data element concept.
C.9.4 Data elements are the basic building blocks
of data.
C.9.5 A data element is the association of a data
element concept with a data element representation.
C.9.6 There may be more than one alternate way a
data element concept is represented as a data element by associating it with different
data element representations.
C.9.7 A data element concept associated with two
or more data element representations yields two or more different data elements.
C.9.8 The term "data element" refers to
a type or class (i.e., the complete set of instances) and not any particular
instantiation of a value for a data element.
Where a specific data element specimen occurs, it is called a "data
element instance".
C.9.9 Each data element will represent no more
than a single data element concept.
C.9.10 A data element is identical to an attribute
in many data modeling paradigms. In a logical data model, a data element is
often considered an attribute.
C.9.11 Data elements are individual, discontinuous
or discrete pieces of information. They
are not defined in analog or digital flows as used in electronically
transmitted audio or video.
C.9.12 Data elements can be "persistent
data" or "transient data" — data that is created and consumed
without ever being stored in a database.
C.9.13 A data element is described independent of
the physical space in which it is stored or transmitted. A single physical space (e.g., a field or
column in a database) may be reused for more than one data element.
C.9.14 If a data element concept and a data element
representation are normalized across the universe of discourse, the data
element will also be normalized.
C.9.15 Each data element should have one
identifier, one definition, one representation, one data steward, and one
common set of business rules governing that element throughout the enterprise.
C.9.16 A data element is associated with a specific
set of values. Any value can be
expressed by a set of symbols.
C.9.17 A data element always takes on a value from
a set of allowed data values. If it
cannot be associated with a set of distinct values, it is not a data
element. These values can include
written characters, sounds, or images.
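Principles C.9.5 through C.9.7 describe a data element as the association of one data element concept with one data element representation. A minimal Python sketch of that association follows. It is informative only. The class and field names are hypothetical, and the country example values are illustrative rather than registered metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str
    property_name: str


@dataclass(frozen=True)
class DataElementRepresentation:
    representation_class: str
    datatype: str
    unit_of_quantity: str = ""


@dataclass(frozen=True)
class DataElement:
    """Association of exactly one concept with exactly one representation."""
    concept: DataElementConcept
    representation: DataElementRepresentation


# One concept, two representations: two different data elements (C.9.7).
country = DataElementConcept(object_class="Country", property_name="Identifier")
name_element = DataElement(country, DataElementRepresentation("Name", "character string"))
code_element = DataElement(country, DataElementRepresentation("Code", "character string"))
```

Because each `DataElement` holds a single concept, the structure also reflects C.9.9: a data element represents no more than one data element concept.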
C.10 Enumerated domain
C.10.1 Each value in an enumerated domain
represents an abstraction of an object in the real world.
C.10.2 The collection of the object concepts in an
enumerated domain is called a "conceptual domain". It is composed of a set of all permissible
value meanings without a specified representation.
C.10.3 Once a data element concept is associated
with a data element representation with an enumerated domain, a value meaning
must be associated with each permissible value in the set.
C.10.4 Each value meaning in a conceptual domain
may be associated with a permissible value member of more than one enumerated
domain representation.
C.11 Identifier
C.11.1 Each data element, object class,
property, data element concept, conceptual domain, value domain, and
representation class will be uniquely identified by its identifier within a
Registration Authority.
C.11.2 Identifiers will carry no intelligence.
C.12 Name
C.12.1 A name will not be used as an identifier.
C.12.2 Various names for various contexts where the
names are used and have meaning are important metadata.
C.12.3 Classification names can be constructed from
the various name meta-attributes associated with object classes and
representation classes.
C.13 Quality
C.13.1 Data elements have several levels of
quality.
C.13.2 All data used in the enterprise should be
recognized, regardless of quality.
C.14 Registration Authority
C.14.1 A Registration Authority is self-nominated.
C.14.2 A Registration Authority obtains a
registration authority identifier.
C.14.3 A Registration Authority manages a data
registry.
C.14.4 Each Registration Authority establishes the
datatype categories used in its data registry.
C.14.5 Each Registration Authority establishes the
procedures used to register data.
C.14.6 A Registration Authority may have an
organization or individual within it acting as a registrar.
C.15 Data Registry
C.15.1 A data registry is a structure to store data
about data that may be shared among Information Systems and/or organizations.
C.15.2 A data registry does not include data about
Information Systems.
C.15.3 A data registry does not include data about
the (conceptual, logical, or physical) structure of databases.
C.15.4 A data registry will be administered by a
Registration Authority who acts as a resource to the registry's clients for
establishing metadata about registered data and their applications.
C.15.5 A data registry is a place to keep
characteristics of classes of objects that exist in the real world that the
business chooses to record as data.
C.15.6 A data registry provides a centralized
directory to describe the meaning, representation, and identification of units
of data and their values.
C.15.7 A data registry enables data to be well
described so that users know exactly what facts are represented.
C.15.8 A data registry supports data sharing with
cross-system and cross-organization descriptions of common data.
C.15.9 A data registry is a database with
appropriate analysis and user interface software.
C.15.10 A data registry may be a stand-alone system, or may be part of an Information
Resource Dictionary System (IRDS) or any other information repository.
C.15.11 A data registry assists in preventing redundancy of registering the same data
(described by a metadata set) multiple times within the same registry.
C.15.12 A data registry assists in preventing unplanned redundancy of the same business
fact in different data elements.
C.15.13 A data registry promotes reusability of data descriptions. Metadata in a data registry should be
structured as modules to maximize the reusability of these metadata sets.
C.15.14 The structure of the data registry is purposely contrived to avoid the common
confusion between multiple-element units of data and single elements of data.
C.15.15 Descriptions of shareable data must be conveniently and immediately accessible
to all users.
C.15.16 Registered data will be organized for easy accessibility.
C.15.17 Each data element will be classified by the object class for which it
represents a property.
C.15.18 A data registry that is available to all interested parties facilitates
harmonization and interchange among the parties.
C.15.19 A data registry incorporates all of the fundamental principles itemized above.
C.15.20 A data registry is sometimes called a "register".
Annex D
Data registry
uses and users
Data
users can share data if they use a common database. However, users often wish to exchange data across organizations
and systems without incurring the delay and cost of creating a communal
database. A more practical way of
sharing data is to create a catalog of descriptions of shareable data. The catalog contains descriptions of the
type of data we have reason to share with others. It does not contain any information about instances of data. It describes types of data including their
allowed values. This data describing
shareable data is what we call metadata.
With
this approach, the key to sharing data is thus to share and reuse
metadata. We can put this metadata in a
catalog that is organized in a way that all stakeholders can use it. Users can have direct access to items in the
catalog with convenient retrieval procedures.
When
we catalog all the data used in an enterprise, we are confronted with several
ways to represent the same “fact”.
Information in the catalog can be organized to assist data
administrators to identify redundancy.
Data administrators can use the metadata catalog to standardize
preferable data descriptions. By
labeling well-described and sanctioned units of data in the catalog, other
users will know which form of data representation to use.
Software
engineers can view descriptions of data that others have already documented in
the catalog. If software engineers find
it easy to copy from others, they promote shareable data. The efficient software engineer can simply
use what other analysts have already done.
Not only will they make data shareable, their task will be easier. Also, ultimately their clients will likely
be happier since this will reduce software development time. It also increases
the quality of the information system product.
Electronic
data interchange (EDI) data element designers' needs are similar to those of
software engineers. They know what
types of information trading partners need to share, but they need to describe
it as data elements. If it exists in a
catalog, they can use it. If it does not exist, they describe a new data
element and put it into the catalog.
End
users have trouble finding the data that interests them. They often do not know its definition, what
it is called, the possible values, what the values mean, et cetera. The catalog can give them the information
they need. Of course, the structure of
the metadata must allow them to find what they are looking for. That is also
true for the other users.
Originally,
in its most rudimentary form, we called this catalog a data dictionary. More recently it has expanded to become the
data encyclopedia. The even more
comprehensive data repository or information repository came next. In the form
described in this document, the directory is a data registry. The data registry is only a sub-set of the
complete metadata that can be included in a data or information
repository. However, that metadata
sub-set is structured in a way that supports administration and retrieval of
registered data. A data registry is
definitely more than just another data dictionary.
A
data registry facilitates sharing data without requiring that all users obtain
this data from a single communal database.
Data can be shared among disparate databases and users.
Annex E
Conceptual and
logical data models
A
conceptual data model describes how relevant information is structured in the
natural world. This has (somewhat
inaccurately or cryptically) been called the "model of the business"
(it is not always a business) or "enterprise model" (the term
enterprise has several common uses).
The conceptual data model provides an excellent place to start modeling
data within a universe of discourse. It
is also the most viable level at which to integrate different data models.
A
conceptual data model can be used to develop a more specific logical data model
of the identical universe of discourse.
A
logical data model describes the same data as structured in an information
system. It is often and accurately
referred to as a "model of the information system". A logical data model can be directly used
for database design. This is the level
where most software engineers start.
This often hinders the identification of the basic concepts to be
represented by the data. It also makes
correct integration of data models significantly more difficult.
A
conceptual data model is converted to a logical data model with several
translations, additions, and decisions. Generally these:
• Add any control and interface objects
or entities.
• Eliminate or resolve many-to-many
relationships.
• Combine entities with one-to-one
relationships.
• Identify key attributes.
• Decide which entities from related
entities can become attributes based upon the intended use and importance of
the data.
• Specify which entities will inherit
foreign keys.
• Specify representation class, datatype,
character count, and other value domain metadata attributes that describe the
data elements used to represent the data element concepts described in the
conceptual model.
• Convert all special relationships such
as subtypes, components, and dependencies into conventional relationships.
• Specify whether each attribute is
mandatory, conditional, or optional.
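As an illustration of one of the steps listed above, the following Python sketch resolves a many-to-many relationship into an associative entity joined by two one-to-many relationships. It is a minimal sketch only: the entity names (Facility, Chemical), the Relationship class, and the naming convention for the associative entity are hypothetical and are not taken from this report.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Relationship:
    """A relationship between two entities in a data model (hypothetical type)."""
    left: str
    right: str
    cardinality: str  # "1:1", "1:N", or "M:N"


def resolve_many_to_many(rel: Relationship) -> List[Relationship]:
    """Replace an M:N relationship with an associative entity.

    Relationships with any other cardinality are returned unchanged.
    """
    if rel.cardinality != "M:N":
        return [rel]
    link = f"{rel.left}_{rel.right}"  # assumed naming convention for the new entity
    return [
        Relationship(rel.left, link, "1:N"),
        Relationship(rel.right, link, "1:N"),
    ]


# Conceptual model: a facility handles many chemicals, and a chemical
# is handled at many facilities.
conceptual = Relationship("Facility", "Chemical", "M:N")
logical = resolve_many_to_many(conceptual)
```

The other steps in the list (identifying keys, assigning value domain attributes, and so on) involve design decisions that a mechanical transformation of this kind cannot make.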
Annex F
Table of Data Element Attributes for Examples
(Informative)
Annex F contains a table
that includes the data element attributes for the examples provided earlier in
this document. The table provides
examples of the metadata associated with three data elements from the ISO 3166
standard (i.e., Country Short Name, Country Long Name, and Country Numeric
Code), an illustrative application data element, and two data elements from the
ISO 6709 standard. The data element
attributes are given in the first column and the illustrative data that could
be registered for each of the example data elements is given in subsequent
columns.
Table of Data Element Attributes for Examples (Informative)
| Data Element Meta-Attribute Name / Example model | ISO 3166 Enumerated, Name | ISO 6709 Non-enumerated, Latitude | Application Enumerated, (System Reference) |
| 1. Data Element Definition and Permissible Values | | | |
| Data Element Definition Context | Registry | Registry | Registry | Facility Data System |
| Data Element Definition | The English-language short name of a country. | The measure in degrees of the angular distance of a position on earth on a meridian north or south of the equator. | The name of the country where a mail piece is delivered. | The name of a country where the addressee is located. |
| Permissible Values | All English-Language Short Country Names from ISO 3166, matched with value meanings. (Afghanistan, Albania,......, Zimbabwe) | Measures of Latitude in Degrees, Minutes, and Seconds | All English-Language Short Country Names from ISO 3166, matched with value meanings. (Afghanistan, Albania,......, Zimbabwe) |
| PV Begin Date | 19971001 | (Not Applicable) | 19971001 |
| PV End Date | (Not Applicable) | (Not Applicable) | (Not Applicable) |
| Value Domain Definition | All English-language short names of all countries. | All measures of the distance of an angle north or south of the equator measured in degrees, minutes, and seconds. | All English-language short names of all countries. |
| Character Set | English language | English language | English language |
| Domain type | Enumerated | Non-enumerated | Enumerated |
| Determinant Type | (Not Applicable) | Range | (Not Applicable) |
| Range Limits | (Not Applicable) | 00-90 for degrees | (Not Applicable) |
| Datatype | Alphanumeric | Alphanumeric | Alphanumeric |
| Minimum | 4 | 7 | 4 |
| Maximum | 44 | 13 | 44 |
| Format | A(60) | A(13) +/-DDMMSS.SSSSS | A(60) |
| Unit of Measure | (Not Applicable) | Sexagesimal | (Not Applicable) |
| Precision | (Not Applicable) | Number of decimal places recorded. | (Not Applicable) |
| 2. Data Element Name and Identifier | | | |
| Data Element Name Context | Registry | Registry | Registry | Facility Data System |
| Data Element Name | Short English-Language Country Name | Latitude Sexagesimal Measure | Mailing Address Country Name | Mailing_Address.Country_Name |
| DE Identifier/Version Number (DI:VI) | 20903:1 | 312345:1 | 5394:1 |
| 3. Other Metadata Attributes | | | |
| Example | China | +674532 and +674531.85435 | China |
| Origin | ISO 3166-1:1997, Codes for the representation of names of countries and their subdivisions - Part 1: Country codes (Document) | ISO 6709-1983 (E), Standard representation of latitude, longitude and altitude for geographic point locations. | Facility Data System, Environmental Protection Agency, Office of Enforcement and Compliance Assessment |
| Note/Description | This data element is included in the EPA revised interim Facility Identification Standard. | Latitude sexagesimal converts to latitude degrees by the following formula: seconds / 60 = decimal minutes; total minutes / 60 = decimal fraction of a degree, which is added to the whole degrees. | This data element is required when mail is intended to be delivered outside the country of origin. |
| Submitting organization | Office of Information Resources Management | Office of Information Resources Management | Office of Enforcement and Compliance Assessment |
| Data Steward | Marian Cody | Larry Fitzwater | James Jones |
| 4. Data Element Concept (DEC) | | | |
| Data Element Concept Name | Country Identifier | Latitude Distance | Address Country Identifier |
| Data Element Concept Definition | An identifier for a primary geopolitical entity of the world. | A measure of the angular distance of a point on the surface of the earth north or south of the equator | An identifier for an address of a primary geopolitical entity of the world. |
| Conceptual Domain Name | Countries of the World | Latitude Coordinates | Countries of the World |
| Conceptual Domain Definition | The primary geopolitical entities of the world. | The coordinates that indicate the distance north or south of the equator for locations. | The primary geopolitical entities of the world. |
| Enumerated Value Meaning Text | The primary geopolitical entity known as <Denmark>. | (Not Applicable) | The primary geopolitical entity known as <Denmark>. |
| VM Begin Date | 19971001 | (Not Applicable) | 19971001 |
| VM End Date | (Not Applicable) | (Not Applicable) | (Not Applicable) |
| Classification | | | |
| Keyword | Country | Horizontal Coordinate, Latitude | Country |
| Group | Country Identifiers, Geopolitical Entities | Geographic Point Locations | Mailing Address |
| Representation Class | Name | Measure | Name |
| Object | Country | Latitude | Address |
| Quality Control | | | |
| Registration Status | Standard | Certified | Recorded |
| Administrative Status | Final | No Further Action | In quality review |
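The latitude column of the table can be exercised with a short Python sketch that parses a value in the +/-DDMMSS.SSSSS format and converts it to decimal degrees by dividing seconds by 60 to obtain decimal minutes and total minutes by 60 to obtain the fraction of a degree. The function name and the validation choices are assumptions made for illustration; ISO 6709 itself is not reproduced here.

```python
import re

# Sign, two-digit degrees, two-digit minutes, two-digit seconds,
# and up to five optional decimal places (7 to 13 characters in total).
_LATITUDE = re.compile(r"^([+-])(\d{2})(\d{2})(\d{2}(?:\.\d{1,5})?)$")


def latitude_to_decimal_degrees(value: str) -> float:
    """Convert a +/-DDMMSS.SSSSS latitude string to signed decimal degrees."""
    match = _LATITUDE.match(value)
    if match is None:
        raise ValueError(f"not a +/-DDMMSS.SSSSS latitude: {value!r}")
    sign, dd, mm, ss = match.groups()
    degrees, minutes, seconds = int(dd), int(mm), float(ss)
    if minutes >= 60 or seconds >= 60:
        raise ValueError("minutes and seconds must be below 60")
    total_minutes = minutes + seconds / 60   # seconds / 60 = decimal minutes
    result = degrees + total_minutes / 60    # total minutes / 60 = fraction of a degree
    if result > 90:
        raise ValueError("latitude must lie within the range limits of 00-90 degrees")
    return -result if sign == "-" else result
```

Applied to the two example values in the table, `latitude_to_decimal_degrees("+674532")` and `latitude_to_decimal_degrees("+674531.85435")` both yield values a little above 67.758 degrees north.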
Annex G
Top down approach to data element registration
A small amount of the data added to a registry comes in groups or classifications (e.g., Chemical Substances or Biological Taxonomy), rather than as individual data elements. When a classified group of data elements is to be added to the registry, the analyst might choose to identify the conceptual domains that are relevant to the group, consider their value meanings, and work down to data elements. For the purpose of this informative annex, the group Biological Taxonomy will be used as the example.
More than one conceptual domain might be identified at the start. Names and definitions for these might include:
1) Biological Organisms: All life forms considered as entities.
2) Biological Organism Types: All ways of typing biological organisms.
G.1 Biological Organisms
Starting with the first conceptual domain, Biological Organisms, we must envision the value meanings that would be appropriate for Biological Organisms. Just as the value meanings for Countries of the World are "The principal geopolitical entity of the world known as ...." where the entity might be France, Germany, Canada, or any of the countries of the world, the value meaning of Biological Organisms would be "The biological organism known as ...."
An essential difference between the two conceptual domains is that we know the names of the "Countries of the World." We do not, however, intend to enumerate all of the life forms that are known. The value meanings for Biological Organisms will not be identified and listed, but will be determined from references. Therefore, only non-enumerated domains will be associated with this conceptual domain.
G.1.1 Data Element Concepts
One data element concept that would be associated with Biological Organisms would be "Biological Organism Label," where "Biological Organism" would be the object, and "Label" the property. Note: Label is defined as a short word indicating that what follows belongs in a particular category or classification (see 5.1.6). The definition of this data element concept would be "A label that identifies a biological organism."
G.1.2 Data Elements
Data elements to be associated with the "Biological Organism Label" would be all of the names, codes, and identification numbers associated with biological organisms, including:
• Biological Organism Taxonomic Name: The systematic name that provides a definitive classification for a biological organism.
• Biological Organism Vernacular Name: The common name that is associated with a biological organism.
• ITIS Taxonomic Serial Number: The unique number assigned to a biological organism by the Integrated Taxonomic Information System (ITIS)[2].
• Biological Identification Number: The unique number assigned to a biological organism by the Biological Registry System.
G.1.3 Permissible Values
Permissible values for these data elements would not be enumerated, as described above in Section G.1. The permissible values, however, will all be names, numbers, and codes that represent an implied value meaning of "The biological organism known as...".
G.2 Biological Organism Types
Biological information can be separated into several categories or types of related entities. Types of biological organisms can be limited for a particular application, and can be expected to have value meanings associated with them. The selection of the types to be included and the definition of each grouping could be based on widely accepted criteria or useful only for a specific application. For example, the types of biological organisms in this sample scheme could include:
• Biota: An animal, plant, fungus, or other biological organism of a region or period.
• Virus: An ultramicroscopic agent that replicates only within the cells of living hosts, which are mainly bacteria, plants, and animals.
• Group: A collection of biological organisms that are related in some way.
Note: The selection of these types, for this example, is based on the fact that ITIS currently does not contain information on viruses and groups. ITIS Taxonomic Serial Numbers would be available only for each biota. Virus identification would come from The Universal Virus Database (http://life.anu.edu.au/viruses/welcome.htm). Groups would include such things as macro-invertebrates, minnows, and coliform that are counted and recorded as aggregates in environmental studies. Although ITIS currently does not contain identification for groups of organisms, it might store information about the individual organisms that are members of a group.
G.2.1 Data Element Concepts
A data element concept associated with the conceptual domain "Biological Organism Types," might be "Biological Organism Type," where Biological Organism is the Object, and Type is the property. Note: It is not always necessary to include the word Label in a Data element concept name. The definition of the data element concept might be "A type of a biological organism."
G.2.2 Data Elements
Data elements associated with this data element concept might be:
• Biological Organism Type Name: The name of the type of a biological organism.
• Biological Organism Type Code: The code that represents a type of biological organism.
G.2.3 Permissible Values
Permissible values for the "Name" representation would be the same names as the value meaning names, and the "Code" representation would be some kind of number or character used to represent the Type.
G.3 Top Down Population of a Registry
The information that is included in a registry would be the same as that shown in Annex F, but the order of population would be different. The following is a reordering of the first column of Annex F to illustrate the top down approach to registry population.
Conceptual Domain (CD) Name
Conceptual Domain Definition
CD ID
Value Meanings
VM Begin Date
VM End Date
VM ID
Data Element Concept (DEC) Name
Data Element Concept Definition
DEC ID
Representation
Value Domain (VD)
VD ID
Domain Type
Determinant type
Range limits
Datatype
Format
Minimum
Maximum
Unit of Measure
Precision
Data Element Name Context
Data Element Definition
Data Element Name
DI:VI
Permissible Values
PV Begin Date
PV End Date
Example
Origin
Note/Description
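The top down order listed above can be pictured as a chain of records in which each entry refers to the one registered before it. The following Python sketch uses the Biological Organisms example from G.1. The class layout, the value domain name "Biological Organism Taxonomic Names", and the value domain link to the conceptual domain are assumptions made for illustration, not structures prescribed by this report.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConceptualDomain:
    name: str
    definition: str


@dataclass
class DataElementConcept:
    name: str
    definition: str
    conceptual_domain: ConceptualDomain


@dataclass
class ValueDomain:
    name: str
    domain_type: str  # "Enumerated" or "Non-enumerated"
    conceptual_domain: ConceptualDomain


@dataclass
class DataElement:
    name: str
    definition: str
    concept: DataElementConcept
    value_domain: ValueDomain


def population_order(element: DataElement) -> List[str]:
    """List the registry entries behind one data element, top down."""
    concept = element.concept
    return [
        concept.conceptual_domain.name,
        concept.name,
        element.value_domain.name,
        element.name,
    ]


organisms = ConceptualDomain(
    "Biological Organisms", "All life forms considered as entities.")
label = DataElementConcept(
    "Biological Organism Label",
    "A label that identifies a biological organism.",
    organisms)
# Hypothetical value domain name, plural as recommended for value domains.
taxonomic_names = ValueDomain(
    "Biological Organism Taxonomic Names", "Non-enumerated", organisms)
taxonomic_name = DataElement(
    "Biological Organism Taxonomic Name",
    "The systematic name that provides a definitive classification "
    "for a biological organism.",
    label,
    taxonomic_names)
```

Because each record can only be constructed once the record it refers to exists, the constructors themselves enforce the top down order of population.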
ANNEX Y
Business Rules for
Populating a Metadata Registry
This annex includes
information on how to record particular metadata attributes, in more detail
than described in Section 5.1 of this technical report.
Y.1 Data Element Definition
The purpose of a data
element definition is to define a data element with words or phrases that
describe, explain, or make definite and clear its meaning. Precise and unambiguous data element
definitions are one of the most critical aspects of ensuring data shareability. The value domain, described in Section Y.2,
identifies the complete set of values that can be contained in a data
element. Each data value in a domain
must conform to the definition for that data element.
ISO/IEC 11179-4 provides the
standard for formulating data element definitions. There are mandatory rules, to which all data element definitions
must comply, and there are guidelines which should be followed in formulating a
definition. The standard does not
specify syntactical requirements (i.e., word order and structure), which may be
established by the registration authority.
A registration authority might choose to allow multiple definitions, in
context, for a data element in the same manner that multiple names, in context,
are allowed. In the case of multiple
definitions, each definition must convey exactly the same meaning so that there
is no ambiguity about the values for that data element. See Section 5.4.1.2 for examples of names and definitions in
context.
The rules and guidelines
applicable to the Registry Definition (i.e., the unique definition that has
been assigned to the data element for registration in a metadata registry)
follow. A syntax that has been adopted
by one registration authority is also included in this section.
Y.1.1 Mandatory Rules
Rules for formulating a data
element definition are mandatory and testable for compliance. The following rules must be followed when
formulating a data element definition:
• Be unique (within any data dictionary in which it appears).
• Be stated in the singular.
• State what the concept is, not only what it is not (i.e., never exclusively in the negative).
• Be stated as a descriptive phrase or sentence.
• Contain only commonly used abbreviations.
• Be expressed without embedded definitions of other data elements or concepts.
Examples of definitions that
meet the above requirements are described in the following paragraphs.
Y.1.1.1 Uniqueness
According to the standard
rules for formulating data definitions, a data definition shall be unique
within any data registry and registration authority in which it appears. Each definition shall be distinguishable
from every other definition within a registration authority to ensure that
specificity is maintained. One or more
characteristics expressed in the definition must differentiate its concept from
other concepts.
Note that a registration
authority that registers incomplete application data elements might contain
several data elements with the same definition, each within the context of the
source of that data element. These data
elements should be linked to the appropriate well-formulated data elements that
contain the same data values. See
Section 5.6.5 for linking of data elements.
Good: Regulation Effective Date: The calendar date when a regulation
became effective.
Sample Collection Start
Date: The calendar date when collection of the sample began.
Poor: Regulation Effective Date: The date when the event started.
Sample Collection Start
Date: The date when the event started.
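Because the uniqueness rule is testable, a registration authority could check it mechanically. The following Python sketch groups data element names that share the same definition text; the function name and the normalization of case and whitespace are assumptions made for illustration, and the sample data are the good and poor examples above.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_definitions(definitions: Dict[str, str]) -> Dict[str, List[str]]:
    """Group data element names that share the same definition text."""
    by_text = defaultdict(list)
    for name, text in definitions.items():
        # Normalize case and whitespace so trivial differences do not hide duplicates.
        key = " ".join(text.lower().split())
        by_text[key].append(name)
    return {text: names for text, names in by_text.items() if len(names) > 1}


good = {
    "Regulation Effective Date": "The calendar date when a regulation became effective.",
    "Sample Collection Start Date": "The calendar date when collection of the sample began.",
}
poor = {
    "Regulation Effective Date": "The date when the event started.",
    "Sample Collection Start Date": "The date when the event started.",
}
```

A check of this kind detects only identical wording. Two definitions that are worded differently but fail to differentiate their concepts still require review by an analyst.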
Y.1.1.2 Singular
The concept expressed by the
data definition shall be expressed in the singular.
Good: The commonly known, short name of a country.
Poor: The commonly known, short name of countries.
Note: The poor definition implies that a name
might identify more than one country.
Y.1.1.3 State the Concept; Not Only its Negative
A definition cannot be
constructed exclusively by saying what the concept is not. The following definitions of
"Country Name" demonstrate good and poor definitions.
Good: The commonly known, short name of a country.
Poor: The name that is not the long name of a country.
Note: In some instances, a good
definition that specifies what the concept is, might also specify what the
concept is not, as in the following example:
Good: The commonly known, short
name of a country that is not its long name.
Y.1.1.4 Descriptive Phrase or
Sentence
A phrase or sentence is
necessary to describe the essential characteristics of the concept. Stating the name as a synonym, or restating
it with the same words is insufficient.
Good: The
commonly known, short name that identifies a country.
Poor: Name of a country.
Note: The poor definition does not describe the concept that this is
the short name, not an expanded or long name.
Y.1.1.5 Contain Only Commonly Used Abbreviations
Understanding the meaning of
an abbreviation, including acronyms and initials, is usually confined to a
certain environment. In other
environments the same abbreviation can cause misinterpretation or
confusion. An exception to this rule
can be made if an abbreviation is more readily understood than the full form
and has been adopted as a term in its own right, such as email (i.e.,
electronic mail), radar (i.e., radio detecting and ranging) and fax (i.e., facsimile). When an abbreviation or an acronym is
included in a definition, it should follow the full term and be enclosed in
parentheses.
Example 1:
Good: The code that represents the economic activity of a company as
specified by the Standard Industrial Classification (SIC) of Establishments.
Poor: The SIC code for a company.
Example 2:
Good: The code that represents the unit for measuring the mass per unit
(m.p.u.) volume.
Poor: The code that represents the unit for measuring the m.p.u.
volume.
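A rough screening aid for this rule can be sketched in Python. The function below flags abbreviations written in capital letters that are not enclosed in parentheses, on the assumption that an abbreviation in parentheses follows its full term. The function name and the heuristic are illustrative only: the check does not verify that the full term is actually present, and it does not recognize dotted abbreviations such as "m.p.u." in Example 2.

```python
import re
from typing import List


def unexpanded_abbreviations(definition: str) -> List[str]:
    """Return capital-letter abbreviations that are not enclosed in parentheses."""
    flagged = []
    for match in re.finditer(r"\b[A-Z]{2,}\b", definition):
        before = definition[match.start() - 1] if match.start() > 0 else ""
        after = definition[match.end()] if match.end() < len(definition) else ""
        if not (before == "(" and after == ")"):
            flagged.append(match.group())
    return flagged


good = ("The code that represents the economic activity of a company as "
        "specified by the Standard Industrial Classification (SIC) of "
        "Establishments.")
poor = "The SIC code for a company."
```

Here `unexpanded_abbreviations(good)` returns an empty list, while `unexpanded_abbreviations(poor)` reports "SIC" for review.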
Y.1.1.6 No Embedded Definitions
The definition of a second
data element or related concept should not appear in the definition proper of
the primary data element.
Good: The text that describes the method used to calibrate the analysis
equipment.
Poor: The text that describes the method used to calibrate the
analysis equipment. Calibration is the
process of rectifying the graduation of an instrument that gives quantitative
measurements.
Note: The term calibration should be defined in an associated glossary
or dictionary.
Y.1.2 Guidelines for Definitions
Highly recommended
guidelines, although not mandatory, are principles that should be followed when
formulating a data element definition.
A definition should:
• State the essential meaning of the concept.
• Be precise and unambiguous.
• Be concise.
• Be able to stand alone.
• Be expressed without embedding rationale, functional usage, domain information, or procedural information.
• Avoid circular reasoning.
• Use the same terminology and consistent logical structure for related definitions.
Examples of these guidelines
are provided in the following paragraphs.
Y.1.2.1 Essential Meaning of Concept
Include all primary aspects
of the concept, but avoid non-essential characteristics.
Good: The name of a country where mail is delivered.
Poor: The last line of a mail piece that names the country where mail
is being sent.
Note: The poor definition contains extraneous information (i.e., the
line where the country name is placed on a mail piece). This information is valuable to those who
are preparing mail pieces (e.g., letters and packages), but does not serve to
define the data element. This
information might be included in a comment about the data element, or in
business rules applicable to mailing address.
Y.1.2.2 Precise and Unambiguous
The exact meaning of a data
element should be apparent from the definition. Codes that are derived from different standards or identifiers
assigned by different sources must be distinguished.
Example 1:
Good: The 2-character alphabetic code assigned by the
International Organization for Standardization (ISO) in ISO 3166-1 to represent a country.
Poor: The code that represents a country.
Note: Country Codes are assigned by ISO 3166-1:1997, FIPS PUB 10-4,
FIPS PUB 104-1, and ANSI Z39.27-1984.
Some are alphabetic (both 2- and 3-character), and at least one is
numeric. The poor definition is imprecise,
making it difficult to clarify the source of the code and its decode.
Note: The source of standard data values in a domain is documented by
association with the source of those values.
The source is sometimes also reflected in the definition, however, so that
there is no misunderstanding as to the source of the data content for the data
element.
Example 2:
Other examples of good definitions
that clearly distinguish between similar data elements are:
• The commonly recognized, short name that identifies a country.
• The official name that identifies a country.
Y.1.2.3 Concise
The definition should be
brief and comprehensive. Extraneous terms are to be avoided.
Good: The surname of a person.
Poor: The part of a person's name that describes the surname of a
person.
Note: The person's surname does not describe the surname - it is
the surname of a person. It is
extraneous to say that the surname is "part of a person's name."
Y.1.2.4 Stand Alone
A good definition must be
able to stand alone, without further definition to understand its meaning.
Good: The Hydrologic Unit Code (HUC) that represents a geographic area that
includes part or all of a surface drainage basin, a combination of drainage
basins, or a distinct hydrologic feature.
Poor: The Hydrologic Unit Code (HUC) code that represents a
cataloging unit.
Note: The term "cataloging unit" does not provide the
understanding that the code represents a drainage basin. For data registries that include a
dictionary or thesaurus, the term cataloging unit should be defined in the
thesaurus.
Y.1.2.5 No Embedded Information
A good definition does not
include embedded rationale, functional usage, domain information, or procedural
information.
Example: The rationale for
using meters instead of feet should not be embedded in the definition.
Good: The distance in meters either above or below a reference surface.
Poor: The distance either above or below a reference surface,
measured in meters instead of feet because meters is an international standard
for measuring distance.
Example: Functional usage
should not be included in the definition (i.e., this data element is [or is
not] used for..).
Good: The code assigned by a state to uniquely identify a facility.
Poor: The code assigned by a state to uniquely identify a facility and to be
used by the state in all data transfer for that facility.
Example: Procedural remarks
(e.g., optionality) should not be part of a data element definition.
Good: The name of the capacity that an organization serves for a
facility.
Poor: The name of the capacity that a company serves for a facility. The role name is used in conjunction with an
organization name in association with a facility.
Note: A data element may have a "Note" or
"Comment" attribute that can be used to capture usage, procedure, and
other explanatory information that is not appropriate to include in the
definition attribute.
Y.1.2.6 Avoid Circular Reasoning
Two definitions should not
be defined in terms of each other. A
definition should not use another concept's definition as its definition. Examples of poor definitions with circular
reasoning are:
Poor: A code number assigned to an object.
Poor: An object identified by a code number.
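Circular pairs of this kind can be screened for with a simple Python sketch that reports any two terms whose definitions each mention the other. The term labels "code number" and "object" are assumed here, since the examples above give only the definition text, and the substring matching is a deliberately crude heuristic rather than a complete test.

```python
from typing import Dict, List, Tuple


def circular_pairs(definitions: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return pairs of terms whose definitions each mention the other term."""
    terms = sorted(definitions)
    pairs = []
    for i, first in enumerate(terms):
        for second in terms[i + 1:]:
            first_uses_second = second.lower() in definitions[first].lower()
            second_uses_first = first.lower() in definitions[second].lower()
            if first_uses_second and second_uses_first:
                pairs.append((first, second))
    return pairs


poor = {
    "code number": "A code number assigned to an object.",
    "object": "An object identified by a code number.",
}
```

For the two poor definitions, `circular_pairs(poor)` reports the pair ("code number", "object"). Longer cycles that pass through three or more definitions are not detected by this sketch.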
Y.1.2.7 Consistency for Related Definitions
A common terminology and
syntax (i.e., consistent logical structure) should be used for similar or
related definitions to facilitate understanding. Where the terminology and syntax are not the same, a user might
assume that there is an implied difference between related definitions.
Good Consistency. The following three definitions represent good
consistency for the code and the name of the method for determining the
vertical coordinate, and also with the name of the method for determining
vertical and horizontal coordinates:
The code that represents the
method used to determine the vertical coordinate.
The name of the method used
to determine the vertical coordinate.
The name of the method used
to determine the horizontal coordinates.
Poor Consistency. The following two
definitions represent poor consistency for code and name of the method for
determining horizontal coordinates:
The name of the method used
to determine the horizontal coordinates.
The code that represents the
method used to determine the latitude and longitude.
Note: Because the terminology is different (horizontal coordinates vs.
latitude and longitude), the registry user might assume that the different
terms have a somewhat different meaning, even though they are simply different
representations of the same concept.
Y.1.3 Data Element Definition Syntax
Only semantic structures of
data element definitions are addressed in ISO/IEC 11179-4. For consistency, a registration authority
might choose to establish syntax rules for the registry, as in the following
example:
• Use a phrase, not a sentence.
Phrase: The name of the country where a mail piece is
delivered.
Sentence: The mailing address country name is the name of the
country where a mail piece is delivered.
Note: The sentence above is not as concise as the phrase; it repeats
the data element name and adds nothing that clarifies or further defines the
data element.
• Since a data element always includes representation,
begin the phrase that defines the data element by stating the representation
class for the data element and its value domain. The definite article "the" is used, because the
definition refers to a specific data value.
Name: The name of ....
Code: The code that represents ....
Text: The text that describes (or defines)....
Number: The number assigned by (Dun & Bradstreet; Chemical Abstracts
Service; the state) to identify a (business establishment, chemical substance,
legislative district)....
OR The number that represents ....
Measure: The measure of the (distance, area, mass)....
Picture: The picture of ....
Graphic: The graph that depicts ....
Quantity: The (sum, dimension, capacity, amount) of ....
Note: For quantity, instead of
repeating the term "quantity" in the definition, more specific terms
are used to describe the type of quantity for which the data element is
applicable. This avoids the wordiness
of a phrase such as "The quantity that indicates the sum of ...."
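A registration authority that adopts syntax rules like those above could check the leading phrase of each definition automatically. The Python sketch below maps each representation class to the leading phrases shown in the example syntax. The dictionary, the function name, and the strict prefix matching are assumptions made for illustration; a registry would substitute its own rules.

```python
# Leading phrases taken from the example syntax above, keyed by representation class.
LEADING_PHRASES = {
    "Name": ("The name of",),
    "Code": ("The code that represents",),
    "Text": ("The text that describes", "The text that defines"),
    "Number": ("The number assigned by", "The number that represents"),
    "Measure": ("The measure of the",),
    "Picture": ("The picture of",),
    "Graphic": ("The graph that depicts",),
    "Quantity": ("The sum of", "The dimension of",
                 "The capacity of", "The amount of"),
}


def follows_syntax(representation_class: str, definition: str) -> bool:
    """Check that a definition begins with a phrase allowed for its class."""
    phrases = LEADING_PHRASES[representation_class]
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return definition.startswith(phrases)


phrase = "The name of the country where a mail piece is delivered."
sentence = ("The mailing address country name is the name of the country "
            "where a mail piece is delivered.")
```

With these rules, `follows_syntax("Name", phrase)` is true and `follows_syntax("Name", sentence)` is false, which matches the phrase and sentence examples given earlier in this section.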
Y.1.4 Terms Commonly Used in Definitions
Although not part of the
standard, there are action terms commonly used in definitions that are
frequently misused or mistakenly interchanged.
The terms have similar, but different, meanings that make subtle changes
to the interpretation of the definitions.
These terms might be included in a user manual, to provide guidance for
formulating definitions. The following
are examples of terms that a registration authority might designate to be used
in definitions, according to the meanings provided:
• Define. To set forth the meaning of a word or phrase.
• Depict. To represent by, or as if by painting, or to characterize by words with vividness of detail.
• Describe. To convey in words the appearance, nature, or attributes of something.
• Designate. To select or nominate for a purpose.
• Identify. To recognize or establish as being a particular person or thing; to verify the identity of something.
• Indicate. To show (as by measuring or recording), point to, draw attention to, or make known briefly in a general way.
For definitions to be
precise and unambiguous, the above terms should be used carefully so that the
exact meaning of the concepts reflected by the definitions is well
understood.
Y.2 Representational Attributes
One of the first things to
consider when registering a data element is how the data element is to be
represented in an implementation. The
representational aspects of a data element include the permissible values (i.e., code
sets), value domain, representation class, and examples of data values. The value domain is the set of permissible
values that will be stored in the data element as well as other
representational attributes.
Y.2.1 Permissible Values
Permissible values are the
exact names, codes, and text that can be stored in a data field in an
information management system. For
value domains that are enumerated, permissible values must be entered into the
registry. The permissible values for
country identification in "Short, English-Language Country Name" will
be those names that are listed in the ISO 3166 standard for that category.
The permissible values for
an enumerated value domain are associated with the value meanings (i.e., the
names and definitions that are included in the conceptual domain of possible
values). The entry of value meanings
and their association with permissible values is described later in this Annex
as Y.5.3.
For non-enumerated domains,
the permissible values are those defined by the value domain description/definition and the rule
description, as described in Section Y.2.2.
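The difference between the two kinds of domain can be illustrated with a short Python sketch. The enumerated domain is stored as a mapping from each permissible value to its value meaning, and the non-enumerated domain is described by a rule. Only three country names are listed for brevity, the range rule uses the limits of 00 to 90 given for the degrees of latitude in Annex F, and all function and variable names are hypothetical.

```python
from typing import Dict

# Enumerated domain: each permissible value is associated with a value meaning.
# A real registry would hold every short name listed in ISO 3166.
SHORT_COUNTRY_NAMES: Dict[str, str] = {
    "Afghanistan": "The primary geopolitical entity known as Afghanistan.",
    "Albania": "The primary geopolitical entity known as Albania.",
    "Zimbabwe": "The primary geopolitical entity known as Zimbabwe.",
}


def is_permissible_country_name(value: str) -> bool:
    """Enumerated domain: the value must appear in the registered list."""
    return value in SHORT_COUNTRY_NAMES


def is_permissible_latitude_degrees(degrees: int) -> bool:
    """Non-enumerated domain: the value must satisfy the range rule."""
    return 0 <= degrees <= 90
```

An enumerated domain can be validated only against the values entered into the registry, whereas a non-enumerated domain is validated against its description and rule.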
Y.2.2 Value Domain
The value domain is
formulated, based on an understanding of the data content. A data element is associated with only one
value domain, and the name of the value domain describes all of the data values
that are included in that domain. Value
domains can have the attributes identified in the following list, not all of
which are in the standard. Attributes
referenced in ISO/IEC TR 15452, Information technology,
Specification of data value domains, are indicated with an asterisk (*),
and those additional attributes also referenced in the ISO/IEC 11179-3, Information
technology - Specification and standardization of data elements Part 3: Basic
attributes of data elements, are indicated with a double asterisk
(**).
(Note:
Part 3 defines value domain as "A set of permissible values. It provides representation, but has no
implication as to what data element concept the values are associated with nor
what the values mean.")
• **Label. The record identifier that represents the value domain. Each value domain must have an identifier, which can be generated by computer software to ensure uniqueness.
• *Name. The name by which a value domain is known. The name should be plural, since a value domain encompasses all values that are included in the domain (e.g., Short English-Language Country Names). Note that a definition can also be used to describe the value domain.
• *Character Set. The collective symbols of a formalized writing system for a language used to intelligibly communicate data. The descriptor 'character set' of a data element attribute is valid at the data element dictionary level and shall be explicitly stated in case of interchange among dictionaries. If one or more of the data element attributes uses a character set that differs from the set generally used for the complete data element dictionary, then the descriptor 'character set' shall be specified. Examples of character sets are "ASCII" (i.e., consisting of 128 7-bit characters) and "EBCDIC" (i.e., consisting of 256 8-bit characters).
For the examples described
in this technical report, the character set does not need to be specified.
(Note:
There is a discrepancy between TR 15452 and Part 3 regarding character set. TR 15452 indicates that character set can be
"alphabetic character" or "numeric character," both of
which are described as "datatype" in Part 3. Part 3 defines character set as in the above
paragraph.)
• **Datatype. The format used for the collection of letters, digits, and/or symbols, to depict values of a data element, determined by the operations that may be performed on the data element. Datatypes are characterized as language independent. They do not follow any particular Database Management System (DBMS) or software language. The standard does not specify the datatypes to be used for the value domains. They must be established by the registration authority. The registration authority might choose to record datatypes in context (e.g., ORACLE or COBOL), in which case the context for the datatype should also be recorded.
An alphanumeric datatype is
composed of either alphabetic characters, numerals, or both. A numeric datatype is composed of
numerals. In general, values that are
intended to be sorted, whether numerals or alphabetic characters, are described
as "alphanumeric." Only
numbers that are used in calculations are given the datatype of
"numeric." The datatype
for date (i.e., day of a calendar year) has been identified as "date,"
and whole numbers as "integers."
When creating metadata for more complex datatypes (e.g., arrays and bit
strings), ISO 11404 provides guidance on datatypes.
$ Domain Type. Value domains are either enumerated or non-enumerated:
(Note
that TR 15452 addresses enumerated domains only. Part 3 describes enumerated and non-enumerated domains, but does
not provide for an attribute to distinguish between them.)
Enumerated domains are those
for which all values can be explicitly expressed in a structured or unstructured
set. Structured sets (e.g., taxonomies
or thesauri) are not addressed in this document. Country names are a fixed list of countries, maintained by
international standards; therefore, the domain type is enumerated.
Non-enumerated domains have
an unspecified set of values. The
values, however, must fall within the scope of the definition. Latitude measures are not restricted to a
fixed list. Therefore, the domain type
is non-enumerated. A non-enumerated
domain must be described by exactly one "non-enumerated domain
description."
$ **Value Domain Description/Definition. Non-enumerated domains must include a textual description of the potentially valid values to be stored in the data element.
$ **Non-enumerated domain description. A designation of procedure or rule for a set of all permissible values for the value domain or the upper and lower limit to a value domain range. The non-enumerated domain must be described as one of the following:
- Procedure.
Measurements and quantities are determined by procedure (e.g., they are
calculated, measured, or generated).
- Reference.
Telephone numbers and facility names are determined by reference (e.g.,
they can be validated in some type of directory).
- Range.
Percentages and temperatures are
examples of range determinations.
Maximum and minimum values are always required for range
determinations. Examples: 1-100% and 32-212 °F.
$ **Rule description. The rule is the logical, mathematical, or other operation that specifies the derivation for a data element. The rule description specifies the derivation of the data element values. For non-enumerated value domains, the rule description describes the procedure, the reference, or the maximum and minimum values for the range that limits the permissible values for a data element.
$ *Maximum and minimum field lengths.
For non-enumerated domains,
the minimum length can be as small as one; the maximum length must be adequate
to accommodate the largest reasonable amount of data for that value domain
(e.g., the maximum length for a text field might be 240 characters).
For enumerated domains, the
actual permissible values determine the minimum and maximum field lengths. For a 3-digit code, both the minimum and
maximum field lengths are three. For
short, English-language country names, the minimum length is 4 (e.g., Peru or
Oman) and the maximum length is 44 (e.g., South Georgia and the South Sandwich
Islands).
$ *Format. The format is a template for the structure of the elements of a value domain. A registry might adopt its own format for displaying data element format, independent of the DBMS or software language. For example, alphanumerics might be depicted as A(n), where "A" represents alphanumeric and "n" is the maximum field length for the data element value. Numerics might be depicted as N(n.d) where the data value has n-digits to the left and d-digits to the right of the decimal point. Integer format might be depicted as I and date as D. The format must distinguish between integers, decimal marks, and floating point notations. It must also reflect any embedded punctuation in the stored data element. Note that ISO 6093 provides guidance on formats.
$ **Unit of Measure. Some value domains require that values for a data element be measured in only one unit (e.g., a requirement that altitude be measured in meters). This attribute contains the name of the unit of measure for all data values for the value domain.
$ **Precision. Where the value for a data element must be measured or recorded according to a specific level of precision, that information is recorded in the precision attribute (e.g., a requirement that the molecular weight for a chemical substance be recorded to two decimal places).
Examples of value domain identifiers (i.e., labels) have been assigned to the examples provided in Annex F to demonstrate uniqueness and reusability of the value domain.
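The field-length and format attributes described above can be derived mechanically for simple cases. The following is a minimal sketch, not part of the standard: the function names are hypothetical, the country list is only a partial sample that happens to contain the shortest and longest names cited in the text, and the codes 124 and 840 are ISO 3166-1 numeric codes added for illustration alongside 036.

```python
def field_lengths(permissible_values):
    """Return (minimum, maximum) field length for an enumerated domain."""
    lengths = [len(value) for value in permissible_values]
    return min(lengths), max(lengths)


def alphanumeric_format(max_length):
    """Depict an alphanumeric format as A(n), n being the maximum field length."""
    return f"A({max_length})"


def numeric_format(integer_digits, decimal_digits):
    """Depict a numeric format as N(n.d): n digits left, d digits right of the decimal mark."""
    return f"N({integer_digits}.{decimal_digits})"


# Partial sample of short English-language country names (illustration only).
short_names = [
    "Peru",
    "Oman",
    "Australia",
    "South Georgia and the South Sandwich Islands",
]
# Three-digit numeric country codes (illustration only).
numeric_codes = ["036", "124", "840"]

name_min, name_max = field_lengths(short_names)
code_min, code_max = field_lengths(numeric_codes)
name_format = alphanumeric_format(name_max)
# A latitude such as 87.123456 has 2 integer digits and 6 decimal digits.
latitude_format = numeric_format(2, 6)
```

For a non-enumerated domain the maximum length cannot be computed this way; it has to be chosen by the registration authority.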
Y.2.3 Representational Terms
Representation is the form
of expression of the data element.
Representation and value domain together provide the data element
representation. Representation terms
are used to describe the form of representation of a data element. An informational list of representation terms is provided in ISO/IEC
11179-5. The list has been expanded in
this document to provide a more comprehensive list of examples that might be
used to describe representation classes, including the following:
$ Amount. The sum total of two or more quantities; an aggregate.
$ Code. A symbol used to represent something.
$ Graphic. Diagrams, graphs, mathematical curves, or the like.
$ Icon. A sign or representation that stands for its object by virtue of a resemblance or analogy to it.
$ Measure. The extent, dimensions, quantity, etc. of something ascertained by comparison with a standard.
$ Name. A word or combination of words by which a person, place, object, or thought is known.
$ Number. A numeral or group of numerals.
$ Picture. A visual representation of a person, object, or scene.
$ Quantity. The property of magnitude of something.
$ Text. A unit of connected speech or writing often composed of one or more sentences that form a cohesive whole.
Y.2.4 Example
Each set of metadata
attributes for a data element includes an example of the kind of data value
that can be stored in that data element.
Data element names and definitions are always defined as singular;
therefore, examples are always singular.
More than one example can be used, however, where necessary to
illustrate the value domain. The
example can be a name, text, code, number, or any of the data representations
described in the value domain. The
following rules apply:
_ For enumerated domains, the data element example must be one of the permitted values for that value domain.
Example for "Country
Name": Australia
When the representation for
the data element is a coded value, a registration authority might choose to use
one of the permitted values for the code as the example, followed by the value
meaning name, enclosed in parentheses.
Example for "Country
Numeric Code": 036 (Australia)
_ For non-enumerated domains, the data element example must be representative of the data that complies with the definition of the value domain.
Example for "Latitude
Degrees Measure": 87.123456
Example for "Location
Comments Text": The coordinates reference
the flag pole in the North parking lot of the installation. This location is near the center of the
facility.
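The two rules above lend themselves to a simple automated check. The sketch below is illustrative only: the class and function names are hypothetical, the country list is a partial sample, and the latitude range of -90 to 90 degrees is an assumption not stated in this annex. Domains described by procedure or reference are left to manual review.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ValueDomain:
    name: str
    # Set for enumerated domains only.
    permissible_values: Optional[FrozenSet[str]] = None
    # (minimum, maximum) for non-enumerated domains described by a range.
    value_range: Optional[Tuple[float, float]] = None


def example_is_valid(domain, example):
    """Check a data element example against its value domain."""
    if domain.permissible_values is not None:
        return example in domain.permissible_values
    if domain.value_range is not None:
        minimum, maximum = domain.value_range
        return minimum <= float(example) <= maximum
    raise ValueError("procedure and reference domains need a manual check")


def split_coded_example(example):
    """Split '036 (Australia)' into the code and the value meaning name."""
    code, _, rest = example.partition(" (")
    meaning = rest[:-1] if rest.endswith(")") else None
    return code, meaning


country_names = ValueDomain(
    "Short English-Language Country Names",
    permissible_values=frozenset({"Australia", "Peru", "Oman"}),
)
latitude = ValueDomain("Latitude Degrees Measures", value_range=(-90.0, 90.0))
```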
Y.3 Identifying and Naming a Data Element
The data element name can be
constructed, based on the value domain values and the data element definitions.
Names are not used as
identifiers for data elements, but as designators that enable humans to refer
to a data element. The definition is
the attribute that provides a full understanding of the data element, and the
data identifier, version identifier, and registration authority identifier
together uniquely identify a data element, as described in ISO/IEC 11179-5.
Every data element must have
at least one name, and each name must be identified with a context. Each
context (e.g., source of a data element name) can have its own naming
convention. Rules for formulating a
data element name are dependent upon the registry in which the data element is
registered. An example follows in Section
Y.3.3.
Multiple names may be
appropriate for a data element based on the intended use for the data
element. Contexts for names are
described in Section Y.3.1. Each data
registry establishes its own naming convention. Suggestions for establishing a naming convention are provided in
Section Y.3.2.
Y.3.1 Name Context
Context names are not listed
in the standard. Examples of name
contexts that might be used for a registration authority include:
$ Legacy - a name that has been used in the past.
$ Standard - a name that has been used in a standard (e.g., ANSI, ISO, or other standard).
$ Short Abbreviation - a name that is used in a computer system.
$ <source system name> - the name that is used by the source that submitted the data element for registration.
$ Registry - the unique name that has been assigned to the data element for registration by a registration authority.
The multiple names for a
single data element might be the same or different names, depending upon their
contexts. The names in context are
often associated with definitions for that context. The definitions must state the exact same concept for the data
element as the registry definition, even if they are defined in different
terms. Examples of non-unique names and
definitions, associated with the same data element but stating the same
concept, are listed as follows:
Registry: Vertical Measure.
The vertical measure, in meters, of the measured point, above or below a
reference point.
Legacy: Vertical Measure.
The measure of elevation (i.e., the altitude), in meters, above or below
a reference datum.
Standard: Altitude. The vertical
distance in meters either above or below a reference surface.
It is clear from
these three definitions that the concept is the same for all (i.e., the
measure of the height (or depth) of an object above or below some point of
reference). The following definition
would not be appropriate, because it would convey a different concept:
Facility Altitude. The height or depth of a facility relative
to sea level.
This definition includes the
concept of "facility," which limits the objects where measurements
are appropriate; "sea level," which limits the point of reference for
the measurement; and it does not restrict the unit of measure to meters. The last data element described (i.e.,
Facility Altitude) is not the same data element as was the previous example of
Vertical Measure/Altitude.
Note: Part 3 of ISO/IEC 11179 includes an attribute for "Unit of
Measure" in the value domain of the metadata registry. This is the appropriate attribute to
indicate the unit by which the data value is to be recorded. In a standard developed by the American
National Standards Institute (ANSI), however, unit of measure was included in
the definition, so it has been replicated in this example. The metadata registry model also includes an
attribute for the precision required for recording the data value.
Y.3.2 Establish a Naming Convention
The Registration Authority
(RA) should establish a naming convention for each name context in the
registry. Where data element names are
provided from other sources, the naming convention may not be fully known (e.g.,
the names assigned to data elements in an application software system). The
naming convention shall be constructed according to ISO/IEC 11179-5 naming
conventions, as explained in the following paragraphs.
$ The Scope of the Naming Convention. The scope of the naming convention determines how broadly the naming convention is applied. For the example registry described in this document, the scope is limited to the Registry name context. For example, a data element might have the name "Regulation Abstract Text" with the context "Registry" and the name "Abstract" in another context. The conventions used for names in contexts other than the Registry name context may not be known to the registration authority, and the naming convention would be documented as "unknown."
$ The Authority That Establishes Names. The RA establishes the Registry Names for a registry. The Environmental Data Registry (EDR) has as its RA the Environmental Protection Agency (EPA). The data steward appointed by that agency is the final authority for the assignment of names. Other registries will establish their own RAs.
$ Semantic Rules for Source and Content of Terms. Semantic rules enable meaning to be conveyed. Each registry shall specify the guidelines used, if any, that govern the source and content of words used in a name. Name components may come from object class terms, property terms, representation terms, and qualifier terms. These terms may be part of a thesaurus or terminology system. The logical group or entity where a data element might be modeled and the conceptual domain where the data values are defined and maintained can be used as source terms in a data element name. The naming convention for some name contexts might specify that the data element name is simply what the data element is commonly called in the organization, and that no semantic rules are enforced.
$ Syntactic Rules for Word Order. Syntactic principles specify the arrangement of components within a name. The specific syntactic rules for a registry, if any, should be specified in the naming convention. In the examples in this document, the convention for syntax for the Registry name context is to include the representation class term as the last term in the name, as in Regulation Abstract Text. Representation class terms are defined in Section Y.2.3 of this Annex.
$ Lexical Rules. These principles concern preferred and non-preferred terms, synonyms, abbreviations, component length, spelling, permissible character set, case sensitivity, and similar rules. Rules for these subjects, if any, are part of the specifications of the naming convention. An RA might choose to establish controlled, well-defined word lists for formulating a name.
$ Name Uniqueness. Each registration authority determines whether a name within a context must be unique. Because users often rely on names as an indication of data values, qualifiers may be used to distinguish similar data elements within a registry (e.g., Horizontal Collection Method Code and Vertical Collection Method Code; Mailing Address Country Name and Geographic Address Country Name).
Y.3.3 Example of a Naming Convention
An example of a naming
convention for the context "Registry Name," and its adaptation for a
specific RA is provided in this section.
For this example, registry name is considered to be the official name by
which a data element is registered in a specific registry.
$ Scope. The scope of this example naming convention is for use in the example registry. Each data element must be assigned a "Registry Name". It is not intended to be the official or preferred name for the organization or industry.
$ Authority. The authority for this example is the U.S. Environmental Protection Agency for its Environmental Data Registry.
$ Semantic Rules. Names shall include a term that indicates the type of values that will be stored in that data element. For example, a data element that represents a domain of Country Identifiers should have the term "Country" in its name. Qualifiers shall be used to differentiate between names that would otherwise be the same. The representation class term shall always be included as the last term in the name.
$ Lexical Rules. A data element name in the example registry shall have a maximum of 100 alphanumeric characters. The language of the registry shall be English, and the character set ASCII. There are no controlled word lists.
$ Name Uniqueness. Names shall be unique within a registration authority for the context Registry.
Y.3.4 Formulating a Data Element Name
The examples used in this
document are based on a naming convention for name context "Registry,"
established by one registration authority.
The example requires that the data element name be constructed to
reflect both the logical entity which includes the data element (i.e., the
object) and the attribute which identifies the type of data value to be
contained in the data element (i.e., the property). Although the entity is not always required to be a term in the
name, the attribute (i.e., type of data value) is a requirement. For the registration authority used in this
example, data element name would always include the representation class term,
such as name, measure, amount, number,
code, quantity, text, or others, as defined in Section Y.2.3.
The data element names in
Exhibit Y.1 are provided as examples of names to be found in one
registry, with the context Registry Name.
The table columns identify the name components. Syntactic rules for names are relative. The only syntactic rule in this example is that
the representation term should be the last component in a name.
| Object | Property (Data Values) | Representation | Qualifier | Resultant Data Element Name |
| Primary Geopolitical Entity | Country Name | Name (1) | | Country Name |
| Address | Country Name | Name (1) | Mailing | Mailing Address Country Name |
| Address | Country Name | Code | Geographic | Geographic Address Country Code |
| Address | Person Name | Name (1) | Mailing | Mailing Address Person Name |
| Facility | Legal Name | Name (1) | | Facility Legal Name |
| Geographic Coordinates (2) | Latitude | Measure | | Latitude Measure |
| Location | Latitude | Measure | Facility | Facility Location Latitude Measure |
| Location | Latitude | Measure | Stack | Stack Location Latitude Measure |
| Geographic Coordinates (2) | Collection Method | Code | Horizontal | Horizontal Collection Method Code |
| Geographic Coordinates (2) | Collection Method | Code | Vertical | Vertical Collection Method Code |
(1) "Name Name" is redundant, so only one "Name" is used in the data element name.
(2) "Geographic Coordinates" is an implied entity not included in the data element name.
Exhibit Y.1. Data Element Names
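The name components in Exhibit Y.1 can be combined mechanically. The following sketch is one possible reading of the exhibit, not a rule stated by the standard: the function name is hypothetical, the word order (qualifier, object, property, representation) is inferred from the resultant names, and dropping a trailing representation term from the property (so that "Country Name" plus "Code" yields "Country Code") is an assumption drawn from the third row and the first table note. An implied object is simply omitted by the caller.

```python
# Representation class terms listed in Section Y.2.3.
REPRESENTATION_TERMS = {
    "amount", "code", "graphic", "icon", "measure",
    "name", "number", "picture", "quantity", "text",
}


def formulate_name(property_term, representation, object_term=None, qualifier=None):
    """Build a Registry-context name with the representation term last."""
    property_words = property_term.split()
    # Assumption: a representation term at the end of the property is replaced
    # by the actual representation, which also avoids "Name Name".
    if len(property_words) > 1 and property_words[-1].lower() in REPRESENTATION_TERMS:
        property_words = property_words[:-1]
    words = []
    if qualifier:
        words.append(qualifier)
    if object_term:
        words.extend(object_term.split())
    words.extend(property_words)
    words.append(representation)
    return " ".join(words)


mailing_country = formulate_name("Country Name", "Name", "Address", "Mailing")
geographic_country = formulate_name("Country Name", "Code", "Address", "Geographic")
stack_latitude = formulate_name("Latitude", "Measure", "Location", "Stack")
```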
Y.4 Identification
Y.4.1 Data Element Identifier and Version Identifier
Part 5 of ISO/IEC 11179
gives principles for naming and identification of data elements. Each data element registered within a
Registration Authority (RA), i.e., an
organization authorized to register metadata, is unambiguously identified with
a unique identifier. At the time a data
element is registered into a metadata registry, a Data Element Identifier (DI)
is assigned to the data element. When a
data element is first registered, it is assigned a Version Identifier (VI) of
"1". The version number is
incremented by "1" for each subsequent change to the data
element. The DI and VI can be assigned
by the system software when a data element is registered in the registry (i.e.,
a new data element record is created in the system). Each registration authority should develop business rules for
versioning data elements and their attributes.
The combination of the Registration Authority Identifier (RAI), DI,
and VI shall constitute the International Registration Data Identifier
(IRDI). This identifier provides unique
identification to a data element internationally. For the examples listed in Annex F, DI and VI have been recorded
to demonstrate uniqueness.
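The three-part identifier can be modeled as a small immutable record. This is a minimal sketch under stated assumptions: the class name, the colon-separated string layout, and the RAI value "EXAMPLE-RA" are hypothetical and are not prescribed by ISO/IEC 11179; only the rule that the first version is "1" and that each change increments it by 1 comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRDI:
    rai: str      # Registration Authority Identifier
    di: str       # Data Element Identifier
    vi: int = 1   # Version Identifier, "1" at first registration

    def next_version(self):
        """Return the identifier of the next version of the same data element."""
        return IRDI(self.rai, self.di, self.vi + 1)

    def __str__(self):
        # Hypothetical display layout; the standard does not fix a separator here.
        return f"{self.rai}:{self.di}:{self.vi}"


first = IRDI("EXAMPLE-RA", "6125")
second = first.next_version()
```

Because the record is frozen and compared by value, two versions of one data element are distinct identifiers while sharing the same RAI and DI.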
A registration authority
might require certain associated administrative information for a data
element. Some attributes are specified
in the standard (e.g., registration status).
Others are determined by the registration authority. Examples of administrative attributes that might
be established by a registration authority are described in this section. No administrative data attributes have been
assigned to the examples described in the text of this document or in the table
provided in Annex F.
Y.4.2 Versioning
Data elements in a metadata
registry are generally entered in sets associated with a document, a standard,
or an application system. In many
cases, a data element may be changed or a data element source like a document,
a standard, or an application system may change and a new version may be
required.
One approach to tracking
changes is to enable database transaction logging which automatically captures
the date and time of all changes. The
drawbacks to that solution are that all changes, whether significant or
insignificant, are logged, and additional processing and space resources are
required to retain all versions of a data element. In addition, just logging
the date and time of a change doesn't effect a version change. The alternative is to manage version
information in database fields that are updated by data analysts who make
judgments according to a set of business rules. It is generally understood that the business rules will initially
be implemented by analysts.
Following are draft business
rules to guide versioning of various objects in a metadata registry.
1. The following objects need to be versioned: group, standard,
document, system, data element, value domain.
2. Value meanings and permissible values would not be versioned
as part of a value domain, but begin and end dates will document changes to
these values.
3. Any version change to a permissible value would result in a
new version of a value domain. Begin
and end dates would be stored.
4. Any value domain changes would result in the need to review
related data elements to determine whether or not they should be
versioned. In some cases, a decision by
a steward or working group would be required to affirm that a data element
would adopt the new version of the value domain.
5. In order to ensure that versioning is effectively applied,
it cannot be decided by software, but requires interpretation of the business
rules by a data analyst. Versions would
be incremented only for non-trivial changes (not typos). In some cases, the data steward and the
registrar would need to agree on changes.
6. Data elements would be versioned based on changes to
definition or representation or format.
7. Changes to data elements within a group would result in
incrementing the version of the group.
8. All changes made to data standards require some
documentation of authorization. This
could be indicated within a text field for each standard.
9. Typographical changes (errata) would require a notification
process. More substantive changes may
require balloting or a consensus process to approve the changes. This approval could be recorded as a new
document in the registry, and could be cited as the source for the new versions
of the data elements.
New data element versions
would be indicated by incrementing the version number associated with the
identifier. This is a new physical
record for the data element, and the registry would continue to store the
earlier versions (i.e., both 6125:1 and 6125:2).
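The record-keeping side of these draft rules can be sketched as follows. The class and method names are hypothetical, and the sample definitions are invented for illustration. In line with rule 5, the code does not decide whether a change is significant; that judgment is passed in by the data analyst as the `significant` flag.

```python
class VersionedRegistry:
    """Keeps every version of a data element as a separate record."""

    def __init__(self):
        self._records = {}  # (data identifier, version) -> attribute dict

    def register(self, di, attributes):
        """Create version 1 of a new data element."""
        self._records[(di, 1)] = dict(attributes)
        return 1

    def versions(self, di):
        return sorted(vi for (d, vi) in self._records if d == di)

    def get(self, di, vi):
        return dict(self._records[(di, vi)])

    def change(self, di, updates, significant):
        """Apply a change; only a significant change creates a new version."""
        current = self.versions(di)[-1]
        if not significant:
            # Trivial change (e.g., a typo): corrected in place, no new version.
            self._records[(di, current)].update(updates)
            return current
        new_record = dict(self._records[(di, current)])
        new_record.update(updates)
        self._records[(di, current + 1)] = new_record
        return current + 1


registry = VersionedRegistry()
v1 = registry.register("6125", {"definition": "The name of a contry."})
v_typo = registry.change("6125", {"definition": "The name of a country."}, significant=False)
v2 = registry.change(
    "6125", {"definition": "The short English-language name of a country."}, significant=True
)
```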
Y.5 Conceptual Relationships
Data element concepts,
conceptual domains, and value meanings are described in this section.
Y.5.1 Data Element Concept
The data element concept is
readily derived, based on the name and definition of the data element. It is a concept that can be represented in
the form of a data element, described independently of any particular
representation. The data element
"Country Name" is a representation of the data element concept
"Country Identifier."
The following list is
provided as guidance for terms that might be used in names and definitions of
data element concepts. Terms that do
not denote representation include the following:
$ Identifier. Something that identifies, i.e., that represents, regards, or treats something as being the same or identical.
$ Label. A short word or phrase descriptive of a person, group, or intellectual movement, or indicating that what follows belongs in a particular category or classification.
$ Tag. A descriptive word or phrase applied to a person, group, organization, etc., as a label or means of identification or epithet.
$ Indicator. Anything that serves to point out or direct attention to, as of a measuring device.
$ Discriminator. A distinction that differentiates one from another.
The data element concept is
the concept for which the conceptual domain contains representative
values. The following list of
characteristics is provided as guidance to ensure consistency in formulating
the names and definitions of data element concepts:
$ Singular. Each data element concept represents only one concept.
$ Does not include representation. It does not incorporate the representation terms such as name, code, text, number, or other terms that denote how the concept can be represented in either the name or the definition of the concept.
_ Indefinite article. The definition is stated with the indefinite articles "a" or "an" since the concept does not specify a particular data value or representation.
_ Can be associated with multiple data elements, each with its own representation and value domain.
ISO 3166, for example,
represents the data element concept "Country Identifier," which can
be represented as names, or it can be represented by codes (e.g., "Country
Name" or "Country Code").
There are more than one name and more than one code associated with the
concept for "Country Identifier."
Each name and each code requires its own data element and value domain.
_ Can be associated with only one conceptual domain.
The appropriate level for
exchanging data values is the conceptual level, through data element concept
and conceptual domain. The value
domains of country codes and country names are translatable, where the value
meanings associated with the conceptual domain reference the same data element
concept for countries of the world.
A data element concept
identifier can be created by the system software, to provide unique
identification and versioning for data element concepts, and an identifier that
can be used to indicate the domain for translation of data values.
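Translation between value domains through shared value meanings can be sketched as a pair of lookups. This is an illustration under stated assumptions: the value meaning identifiers (VM1 to VM3) and the function name are hypothetical, and the numeric codes 124 and 484 are the ISO 3166-1 numeric codes for Canada and Mexico, added alongside the 036 used in this annex.

```python
# Value meanings of the conceptual domain, keyed by a hypothetical identifier.
value_meanings = {"VM1": "Australia", "VM2": "Canada", "VM3": "Mexico"}

# Each value domain maps its permissible values to value meaning identifiers.
country_numeric_codes = {"036": "VM1", "124": "VM2", "484": "VM3"}
country_short_names = {"Australia": "VM1", "Canada": "VM2", "Mexico": "VM3"}


def translate(value, source_domain, target_domain):
    """Translate a permissible value via the value meaning both domains share."""
    meaning_id = source_domain[value]
    for target_value, target_meaning_id in target_domain.items():
        if target_meaning_id == meaning_id:
            return target_value
    raise KeyError(f"no permissible value for meaning {meaning_id} in target domain")


name_for_036 = translate("036", country_numeric_codes, country_short_names)
code_for_canada = translate("Canada", country_short_names, country_numeric_codes)
```

Neither value domain refers to the other directly; the only link is the value meaning identifier, which is what makes the conceptual level the appropriate level for exchange.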
Y.5.2 Conceptual Domain
A conceptual domain is a
perception template of understanding that might be an enumerated set of
meanings. A data element concept uses a
conceptual domain to constrain its perception meaning. An enumerated conceptual domain is a set of
all possible, valid value meanings of a data element concept expressed without
representation. The conceptual domain
for the "Country Identifier" data element concept is the collection
of all the value meanings that can be used to identify all of the countries of
the world.
Characteristics of
conceptual domains include:
_ Plural. Whether enumerated or non-enumerated, a conceptual domain includes the entire body of information that might be included as meanings of the data values in a particular data element for a particular concept. Therefore, the name and definition are always described as plural.
_ Object oriented. The name is used to identify the component contained in the conceptual domain. It does not require a property identifier or an object class. For example, "Countries of the World" includes the identification of all countries.
_ Lacking representation. The definition identifies the type of information that a conceptual domain encompasses, without using representation class terms such as code, name, text, number, picture, measure, quantity, and identifier. For example: "Countries of the World" is defined as "The primary geopolitical entities of the world," not as "The names of the primary geopolitical entities of the world."
_ Conceptual domains can be, and often are, associated with more than one data element concept. Data element concepts that "Countries of the World" could be associated with include, but are not limited to:
- Address Country Identifier.
- North American Country Identifier.
- NATO Country Identifier.
- Geographic Country Identifier.
A conceptual domain can be
associated with any data element concept that uses the same value meanings
(e.g., United States, Canada, and Mexico are value meaning names for both the
Address Country Identifier and the North American Country Identifier concepts). Different value meanings require a different
conceptual domain. For example, in a
database about countries, a data element that contains information about a
country other than country identification (e.g., size, type of government,
economic activities) would have its own conceptual domain.
A rule for determining if a
data element concept can be associated with a conceptual domain is to consider
the value meanings associated with the conceptual domain. Names such as Frigid, Tropical, or Temperate
could be permissible values for a conceptual domain about geographic zones
where countries are located, but they cannot be defined as "The principal
geopolitical division of the world known as <country name>." They would not be associated with the
conceptual domain "Countries of the World."
Where the content of the
value meanings is the same for more than one data element/data element
concept/value domain, the conceptual domain can be reused for multiple data
element concepts as described previously in this section. Conceptual domain identifiers have been
recorded for the examples provided in Annex F to demonstrate uniqueness and
reusability.
Y.5.3 Value Meanings
Every enumerated conceptual
domain is associated with more than one value meaning. A value meaning is the meaning (description)
of a permissible value that will be stored in a data element. Value meanings can have both name and
definition. Often the "name"
of a value meaning becomes the permissible value of that value meaning in a
data element with "name" representation. Characteristics of value meaning names and definitions are:
$ Cannot be a representation. The name and definition do not contain representation class terms such as name, number, text, code, or other representation terms.
$ Must be associated with at least one conceptual domain.
$ Can be associated with more than one conceptual domain.
Example 1: Value meaning names associated with the
conceptual domain "States of the United States" are also associated
with the conceptual domain "Data Collection Sources" in one data
registry.
Example 2: The value meaning
name "Unknown," indicating that the data value for a particular data
element is not known, can be associated with many conceptual domains.
$ Begin and End Dates. The dates when a value meaning was entered into a conceptual domain and when a value meaning was no longer valid for a conceptual domain are required in a data registry.
$ Unique Identifier. Each value meaning has a unique identifier (VMID) in a registry. The VMID and the data element unique identifier (IRDI) provide unique identification of a particular data element item occurrence. This combination of identifiers is valuable for data transfer.
In addition, the value
meaning should be singular. Each value
meaning represents one instance of the meaning of a value to be found in a data
element.
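The begin and end dates of a value meaning can be used to decide whether it was valid on a given day. The sketch below is illustrative only: the class name, the identifier "VM-0001", and all dates are hypothetical, and treating the end date as inclusive is an assumption, since the text does not say.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ValueMeaning:
    vmid: str                        # unique identifier within the registry
    name: str
    begin_date: date                 # when it was entered into the conceptual domain
    end_date: Optional[date] = None  # None while the value meaning is still valid

    def is_valid_on(self, day):
        """True if the value meaning was valid on the given day (end date inclusive)."""
        if day < self.begin_date:
            return False
        return self.end_date is None or day <= self.end_date


# Hypothetical identifiers and dates, for illustration only.
unknown = ValueMeaning("VM-0001", "Unknown", date(1998, 1, 1))
retired = ValueMeaning("VM-0002", "Example Retired Meaning", date(1998, 1, 1), date(1999, 6, 30))
```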
Y.6 Classification
Classification helps to add
information not easily included in definitions, helps to organize the contents
of a metadata registry, and helps to provide access by supporting more
meaningful
queries. Part 2 of ISO/IEC 11179 describes general
categories of classification; Part 5 describes three classified components:
object class, property, and representation class. An object class term represents an activity or object in a
context. Property terms are terms that
modify an object term. Representation
class terms describe the form of representation. Representation terms are described in Annex Section Y.2.3.
A metadata registry might
choose to classify data elements as groups, e.g., the group of data elements
used in a mailing address, the group of data elements used to identify chemical
substances, or the group of data elements that locate a point on the surface of
the earth.
Keywords might also be used
to classify data elements, e.g., altitude, date, facility, industrial, and
organization.
Y.7 Quality Review
As the metadata for a data element is completed, the data element progresses through a review process toward standardization, where appropriate. The Registration and Administrative Statuses indicate the status of a data element in the registration/standardization process.
Y.7.1 Registration Status
The standard values for registration status include the following:
- Incomplete. The data element does NOT have all the necessary metadata.
- Recorded. The data element has all the necessary metadata, but has NOT met all the quality requirements.
- Certified. The data element has all the necessary metadata and has met all quality requirements.
- Standard. The data element has all necessary metadata, has met all quality requirements, and has been approved by the Registration Authority.
- Retired. The data element is no longer used in the registry.
The registration authority might also choose to use Legacy as a registration status:
- Legacy. The data element was obtained from a legacy system and may be missing some metadata. It has not been considered for standardization.
The registration status of a new data element is listed as "Incomplete" until all attributes associated with that data element have been completed. Once all of the data element's attributes have been verified as complete, the registration status is changed to "Recorded." Other status changes are determined by the registration authority.
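The rule for a new data element can be sketched as a simple status check. This is a minimal illustration under stated assumptions: the set of required attributes is hypothetical (the registration authority defines the actual set), and the sketch models only the Incomplete-to-Recorded change, since all other changes are decided by the registration authority.

```python
from enum import Enum
from typing import List, Mapping, Sequence


class RegistrationStatus(Enum):
    INCOMPLETE = "Incomplete"
    RECORDED = "Recorded"
    CERTIFIED = "Certified"
    STANDARD = "Standard"
    RETIRED = "Retired"
    LEGACY = "Legacy"   # optional status a registration authority might use


# Hypothetical list of required attributes, for illustration only.
REQUIRED_ATTRIBUTES = ("name", "definition", "identifier")


def missing_attributes(metadata: Mapping[str, str],
                       required: Sequence[str] = REQUIRED_ATTRIBUTES) -> List[str]:
    """Return the required attributes that are absent or empty."""
    return [attr for attr in required if not metadata.get(attr)]


def update_status(current: RegistrationStatus,
                  metadata: Mapping[str, str],
                  required: Sequence[str] = REQUIRED_ATTRIBUTES) -> RegistrationStatus:
    """Move an Incomplete data element to Recorded once all required attributes are present.

    Any other status is returned unchanged, because those changes are
    determined by the registration authority and are not modeled here.
    """
    if current is RegistrationStatus.INCOMPLETE and not missing_attributes(metadata, required):
        return RegistrationStatus.RECORDED
    return current
```

Keeping missing_attributes separate lets a registrar see which attributes still block the change to "Recorded" before the status is re-evaluated.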
[1] American National Standard for Information Technology, Metamodel for the Management of Shareable Data, February 20, 1999, ANSI X3.285:1999, proposed as a replacement for ISO/IEC 11179, Part 3.
[2] The ITIS is a partnership of U.S., Canadian, and Mexican agencies, other organizations, and taxonomic specialists cooperating on the development of an online, scientifically credible list of biological names focusing on the biota of North America. ITIS uses the five-kingdom system for identification and assigns taxonomic serial numbers to each taxonomic level in an identification. ITIS is meant to serve as a standard to enable the comparison of biodiversity datasets, and therefore aims to incorporate classifications that have gained broad acceptance in the taxonomic literature and among professionals who work with the taxa concerned.