ISO/IEC 11179 Metadata Registries are intended to bridge conceptual models (semantic specifications, data element concepts) and information artifact specifications (data element specifications). These registries are used to facilitate interoperability among databases, datasets, messaging systems, and various application programs. Historically, they have been used to communicate semantic information about information artifacts among human database and application designers and developers (and sometimes end users). Developments in agents, inference engines, and web services technologies have generated increased interest in using ISO/IEC 11179 metadata registries to convey semantic information for use by other programs (machine processing). Such uses require more precise and more formal descriptions of semantics. Here we discuss proposed enhancements in the modeling of relationships for ISO/IEC 11179 Metadata Registries; elsewhere we discuss enhancements in capturing ontologies and formal axioms.
Here we discuss a series of proposed changes to ISO/IEC 11179 Part 3 Revision 3 intended to provide better support for recording information about relationships in ISO/IEC 11179 metadata registries. This is part of an effort to expand the scope of ISO/IEC 11179 beyond data elements and data element concepts to encompass terminologies, thesauri, database schemas, and ontologies. Each of the individual changes will be discussed in separate ISO/IEC 11179 issue documents. This document is intended to provide an overview of the proposed changes.
The ISO/IEC 11179 metamodel has two types of relationships: built-in relationships (prespecified relationships among existing metamodel "entities") and user-specified relationships (used primarily to specify "semantic relationships" among "concepts"). User-specified relationships have been present in ISO/IEC 11179, but not in a consistent, systematic way. Better representation and specification of user-specified relationships serve several possible purposes in ISO/IEC 11179:
It is clear that the first use is essential to any reasonable effort to expand the ability of ISO/IEC 11179 metadata registries to encompass more extensive semantic objects: dictionaries, terminologies, thesauri, taxonomies, ontologies. The latter two applications are somewhat more controversial. The use of relationships in support of schema-specific relationships raises issues of the scope of ISO/IEC 11179 (i.e., scope creep) and its relationship to other metadata registry standards, notably the MOF and CWM standards from OMG. Examples of such usage include the caDSR inclusion of UML diagrams. The third use of relationships (as an extension mechanism for ISO/IEC 11179) is also controversial, as some members of the L8 and SC32 WG2 believe that such measures will introduce confusion into the normative usage of ISO/IEC 11179 and lead to multiple approaches to recording standard relationships among existing metadata entities.
In data modeling and knowledge representation, relationship and relation are often used nearly synonymously to refer to "a logical or natural association between two or more things" or to "an abstraction belonging to or characteristic of two entities or parts together". In short, "relationships" are the connections among things (objects, entities) in a data model (schema). Relationships are defined with respect to several arguments (a.k.a. roles or sometimes attributes). The number of arguments (roles) of a relationship is known as its "arity". Binary relationships have arity two (for example, "mother-of" is a binary relation defined over the arguments (roles) of mother and child). N-ary relationships are the generalization of binary relationships and permit relationships of arbitrary arity. Hence, between(a,b,c) is a ternary relationship in which b lies between a and c. Binary and n-ary relationships are discussed further below. Individual subsections discuss each of the following topics:
Some persons have expressed a desire to use the same relationship meta-model throughout ISO/IEC 11179 to model relationships among:
The rationale for allowing the specification of user-specified relationships among a variety of metadata entities in a registry is that this provides a means of extending the basic ISO/IEC 11179 to encompass additional, unanticipated relationships among the metadata entities.
However, others, e.g., Kevin Keck, have expressed concerns that a general user-specified relationship capability in ISO/IEC 11179 might lead to confusion and to uncontrolled diversity in the implementation and usage of ISO/IEC 11179 metadata registries, because users may use user-specified relationships to "re-implement" relationships already specified in the ISO/IEC 11179 metamodel.
Note that if we only want "relationships" for semantic relationships among "concepts" then we can much more tightly constrain the domain and ranges of relationships. However, this would sharply constrain the utility of "relationships".
We want relationships to be a kind of administered item, so that we can keep track of their provenance and versioning metadata attributes. Recall that "administered items" in ISO/IEC 11179 provide a means to record the provenance, owner, last modification time, etc. of an item. One of the issues raised by the desire to make individual relationships into administered items is the burden this places on the registration staff to generate and maintain all of the administrative information.
Since we expect that users of ISO/IEC 11179 metadata registries will want to encode schemas from UML, ER, CWM, and relational data models, we would like to ensure that our relationship meta-model is sufficiently expressive. Note that the caDSR registry from the National Cancer Institute has begun to register UML diagrams (schema fragments). Note also that there is an existing OMG CWM (Common Warehouse Metamodel) for capturing UML and other types of schemas. We should probably specify how schema relationship modeling in ISO/IEC 11179 will relate to that of OMG's CWM.
Relationships hold amongst their arguments (or roles). For example, motherhood is a binary relationship between a "mother" and a "child". Here "mother" and "child" are the roles (arguments) of the relationship "motherhood". Sometimes (e.g., in relational database schemas) roles are called attributes. Observe that the "object" which occupies the "mother" role must be a female person (in a typical application - in zoology the "mother" might be a female animal), whereas "child" can be a person of either sex. These are examples of type restrictions on roles.
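The type restrictions on the roles of "motherhood" can be illustrated with a small Python sketch. The class names (Person, Role, Relationship) and the predicate-based restriction mechanism are assumptions made for illustration only; they are not part of the ISO/IEC 11179 metamodel.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Person:
    name: str
    sex: str  # "F" or "M" in this toy example


@dataclass(frozen=True)
class Role:
    """A named argument of a relationship plus a (hypothetical) type restriction."""
    name: str
    accepts: Callable[[Any], bool]


@dataclass(frozen=True)
class Relationship:
    name: str
    roles: Tuple[Role, ...]

    @property
    def arity(self) -> int:
        # The arity is simply the number of roles.
        return len(self.roles)

    def violations(self, binding: Dict[str, Any]) -> List[str]:
        """Return the names of roles that are unfilled or filled by a disallowed object."""
        return [
            role.name
            for role in self.roles
            if role.name not in binding or not role.accepts(binding[role.name])
        ]


motherhood = Relationship(
    name="motherhood",
    roles=(
        # The mother role is restricted to female persons.
        Role("mother", lambda x: isinstance(x, Person) and x.sex == "F"),
        # The child role accepts a person of either sex.
        Role("child", lambda x: isinstance(x, Person)),
    ),
)

alice = Person("Alice", "F")
bob = Person("Bob", "M")

print(motherhood.arity)                                      # 2
print(motherhood.violations({"mother": alice, "child": bob}))  # []
print(motherhood.violations({"mother": bob, "child": alice}))  # ['mother']
```

In a zoological application one would swap the predicate on the "mother" role for one that accepts female animals, leaving the rest of the relationship definition unchanged.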
In RDF Schema (W3C's Resource Description Framework Schema) and many other modeling formalisms, it is possible to restrict the types of the "things" referenced by relationship roles. This can be quite important as an aid for specifying the semantics of relationships. In ISO/IEC 11179 we have yet to formulate a clean, universal type system for all sorts of things in the metamodel. Hence, we will have to fix up the ISO/IEC 11179 type system in order to have types on the referents of relationship roles. At present ISO/IEC 11179 only includes "data types" (e.g., integer, string, etc.) for data elements, not for data element concepts - i.e., "types" are seen as part of the representational specification, not the semantic specification. We regard this stance as inadequate, since integers have different semantics (addition, subtraction, and multiplication operations) than strings.
It is almost universally the case that one wants to specify cardinality constraints on the various roles in a relationship. Commonly, one specifies the minimum and maximum cardinality allowed on each end of a binary relationship. Thus one may speak of binary relationships being one-to-one, one-to-many, many-to-many, or many-to-one. A minimum cardinality of zero indicates that a relationship is optional, while a minimum cardinality of one (or more) indicates that a relationship is mandatory. Thus motherhood is a one-to-many relationship, with minimum cardinality on the child role of zero (some women do not have children). In American society "currently-married-to" would be a one-to-one relationship (since bigamy is illegal). Similar facilities are available in UML, XML, and Entity-Relationship modeling tools. Hence, we need to include this in our relationship model, and we need to do this properly for n-ary relationships. In the case of n-ary relationships some care is needed in specifying the definitions of the cardinality constraints. Cardinality constraints are generally not considered controversial.
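A minimal sketch of how minimum and maximum cardinalities on a binary relationship might be checked mechanically is given here in Python. The names Cardinality and check_role, and the sample data, are hypothetical; the sketch covers only the binary case and does not address the subtleties of n-ary cardinality definitions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Hashable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Cardinality:
    minimum: int
    maximum: Optional[int]  # None means unbounded ("many")

    def allows(self, count: int) -> bool:
        return count >= self.minimum and (self.maximum is None or count <= self.maximum)


def check_role(
    participants: Sequence[Hashable],
    links: Sequence[Tuple[Hashable, Hashable]],
    position: int,
    cardinality: Cardinality,
) -> List[Hashable]:
    """Return the participants whose number of links falls outside the allowed range.

    `position` selects which end of each (mother, child) pair the participant occupies.
    """
    counts = Counter(link[position] for link in links)
    return [p for p in participants if not cardinality.allows(counts[p])]


# Motherhood as (mother, child) pairs.
links = [("Alice", "Bob"), ("Alice", "Carol")]
women = ["Alice", "Dana"]
children = ["Bob", "Carol", "Eve"]

# A woman may have zero or more children (optional, unbounded).
children_per_mother = Cardinality(minimum=0, maximum=None)
# A child has exactly one mother (mandatory, at most one).
mothers_per_child = Cardinality(minimum=1, maximum=1)

print(check_role(women, links, 0, children_per_mother))   # []
print(check_role(children, links, 1, mothers_per_child))  # ['Eve']
```

Here "Eve" is reported because no mother is recorded for her, which violates the mandatory (minimum one) constraint on that end of the relationship.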
Specification of additional constraints on relationships is useful to enhance the precision of the semantic specification of the relationship. This is useful for both human and machine processing. Among humans, explicit integrity constraints on relationships can reduce opportunities for misunderstandings across time, space, and cultures. Machine-processable specifications of additional integrity constraints permit mechanical checking of such constraints, and are often used to prevent database modifications (e.g., data entry) which would violate such constraints, for example entry of a misspelled state of residence (or birth).
There are a variety of other integrity constraints which one may wish to describe for relationships, such as functional dependencies, inclusion dependencies, etc. (described below). It would be desirable to be able to specify these in the relationship metadata. Traditionally, these sorts of constraints have been specified with respect to "information artifacts", e.g., relational database designs, rather than "conceptual models". However, we believe that these constraints reflect semantics of the conceptual models, not merely information artifacts. Note that the specification of such relational constraints, e.g., inclusion dependencies as foreign key constraints in relational DB designs, is often framed in terms of the information artifacts (relations), rather than the semantics of the conceptual relationships being modeled.
Functional dependencies are often described (in relational DB designs) as "key constraints", i.e., there is a functional dependency from the key attribute to all of the non-key attributes of the relation. Inclusion dependencies are often described as "foreign key constraints" and specify that a given "foreign key" attribute must be "included" in some other specific relation. Note that "inclusion dependencies" often appear in metadata registries as specifications that a particular attribute must be an element of an enumerated value domain. Such enumerated value domains (perhaps a controlled vocabulary) are usually assumed to be only slowly or infrequently changing. In a more general relational database setting, inclusion dependencies may specify more dynamic enclosing sets - e.g., a personnel register.
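Both kinds of dependency can be checked mechanically over a set of records. The following Python sketch uses hypothetical helper names (holds_fd, inclusion_violations) and made-up sample rows; the enumerated value domain of state codes is deliberately tiny.

```python
from typing import Any, Dict, Iterable, List, Sequence, Set

Row = Dict[str, Any]


def holds_fd(rows: Iterable[Row], lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Functional dependency lhs -> rhs: rows that agree on lhs must agree on rhs."""
    seen: Dict[tuple, tuple] = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it thereafter.
        if seen.setdefault(key, value) != value:
            return False
    return True


def inclusion_violations(rows: Iterable[Row], column: str, permitted: Set[Any]) -> List[Any]:
    """Inclusion dependency: every value of `column` must appear in `permitted`."""
    return [row[column] for row in rows if row[column] not in permitted]


rows = [
    {"id": 1, "name": "Ann", "state": "CA"},
    {"id": 2, "name": "Raj", "state": "NY"},
    {"id": 3, "name": "Lee", "state": "CX"},  # misspelled state code
]
state_codes = {"CA", "NY", "TX"}  # toy enumerated value domain

print(holds_fd(rows, ["id"], ["name", "state"]))        # True: id acts as a key
print(inclusion_violations(rows, "state", state_codes))  # ['CX']

# A second row with id 1 but a different state breaks the key constraint.
conflicting = rows + [{"id": 1, "name": "Ann", "state": "TX"}]
print(holds_fd(conflicting, ["id"], ["state"]))          # False
```

For a more dynamic enclosing set, such as a personnel register, the `permitted` argument would be recomputed from the current contents of the referenced relation rather than taken from a fixed enumeration.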
ID dependencies are used to specify that one data element's attributes form part of the key of another (dependent) data element. This often occurs when one wants to model nested entities such as countries, states, counties, and cities. One needs all of the attributes of the nested entities to specify a "key" for the most dependent data element (here cities). Note that the ID dependencies specify a type of "part-of" relationship.
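The need for the attributes of the enclosing entities can be seen in a small Python sketch. The helper name is_key and the tuple layout (country, state, county, city) are assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def is_key(rows: Sequence[Tuple[str, ...]], positions: Sequence[int]) -> bool:
    """True if the projection onto `positions` distinguishes every row."""
    projected = {tuple(row[i] for i in positions) for row in rows}
    return len(projected) == len(rows)


# (country, state, county, city)
cities = [
    ("US", "Oregon", "Multnomah", "Portland"),
    ("US", "Maine", "Cumberland", "Portland"),
    ("US", "California", "Alameda", "Berkeley"),
]

print(is_key(cities, [3]))           # False: two cities are named Portland
print(is_key(cities, [0, 1, 2, 3]))  # True: the full nesting path identifies a city
```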
Explicit specification of functional and inclusion dependencies has not been incorporated into ISO/IEC 11179 Edition 2.
There are a wide variety of semantic relationships in use in various terminologies, thesauri, taxonomies, anatomies, and ontologies. It would be desirable to provide a list of standard semantic relationships, which data modelers could use (unless they have already adopted a set). This would greatly ease the integration of various metadata sets.
At a minimum we will require the following:
We may also want to adopt standard relationships for:
Note that a number of major ontologies contain standard sets of semantic relationships, e.g., the UMLS semantic network. Often, however, these "standard semantic relationships" are very domain specific.
The exact set of relationships to be included in a standard set of semantic relationships has yet to be settled.
We need a set of examples of how the proposed relationship modeling facilities would be used. Such concrete examples are useful to communicate the modeling methodology practice. [This is still to be done ...]
Many object models assume that the attributes (roles) of an N-ary relationship are ordered. The semantic significance of such orderings is unclear [to F. Olken]. In UML roles are not ordered, but in OMG's CWM (Object Management Group's Common Warehouse Metamodel) roles are ordered. So far as I [F. Olken] can determine, the only reason for ordering attributes (roles) is that this is sometimes useful in the physical design of relations in relational DBMSs. It is also useful for preserving relationship descriptions which undergo round trips through a modeling tool, so that the attribute order does not change. Note that order relationships among "elements" are ubiquitous in XML and need to be preserved.
Binary relationships are widely used in terminologies, thesauri, and many ontologies. Common categories of binary relationships include: taxonomic (is_a), meronymic (part_of), .... See discussion below of standard semantic relationships.
Properties of binary relationships include:
Binary relationships can also be characterized in graph theoretic terms:
Note that RDF query engines only support binary relationships. Note also that XPath supports a binary relationship model [Kevin Keck].
See discussion below about representing n-ary relationships as n binary relationships.
In the most recent proposed revisions to the ISO/IEC 11179 metamodel, symmetric and asymmetric binary relationships have become subclasses of binary relationships. This reflects a decision by K. Keck to model symmetric relations as having only a single "role", which is used twice in two "role ends" entities to model the pair of symmetric roles. At present the proposed revised metamodel does not yet include the graph theoretic characterization properties described below.
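The distinction between symmetric and asymmetric binary relationships, along with properties such as reflexivity and transitivity, can be checked mechanically when a relationship is given as a finite set of pairs. The following Python sketch uses hypothetical function names and toy data; it is not part of the proposed metamodel.

```python
from typing import Hashable, Iterable, Set, Tuple

Pair = Tuple[Hashable, Hashable]


def is_reflexive(pairs: Set[Pair], domain: Iterable[Hashable]) -> bool:
    """Every element of the domain is related to itself."""
    return all((x, x) in pairs for x in domain)


def is_symmetric(pairs: Set[Pair]) -> bool:
    """Whenever (a, b) holds, (b, a) holds as well."""
    return all((b, a) in pairs for a, b in pairs)


def is_asymmetric(pairs: Set[Pair]) -> bool:
    """Whenever (a, b) holds, (b, a) does not."""
    return all((b, a) not in pairs for a, b in pairs)


def is_transitive(pairs: Set[Pair]) -> bool:
    """Whenever (a, b) and (b, d) hold, (a, d) holds as well."""
    return all((a, d) in pairs for a, b in pairs for c, d in pairs if b == c)


married_to = {("Ann", "Bo"), ("Bo", "Ann")}
part_of = {("finger", "hand"), ("hand", "arm"), ("finger", "arm")}

print(is_symmetric(married_to), is_asymmetric(married_to))  # True False
print(is_symmetric(part_of), is_asymmetric(part_of))        # False True
print(is_transitive(part_of))                               # True
print(is_reflexive(part_of, {"finger", "hand", "arm"}))     # False
```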
N-ary relationships are the extension of binary relationships to higher arity (more attributes/roles). N-ary relationships are commonly used in various database models and schemas, including UML, ER models, and relational schemas, and in various ontologies. An example of a ternary relationship is Between(a,x,b), meaning that either a<x<b or a>x>b.
N-ary relationships can be decomposed into binary relationships (cf. the NIAM data model, or more recent work with RDF), but the decomposition tends to be quite verbose. Thus n-ary relationships usually permit more concise schemas / data models / ontologies. N-ary relationships have been proposed for ISO/IEC 11179 in order to directly support the registration of ontologies which include n-ary relationships.
It is commonplace to represent n-ary relationships as n binary relationships. However, there is some philosophical debate about whether this entails some loss of semantic fidelity - see for example work by Charles Peirce on "thirdness".
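One common way to carry out such a decomposition is to introduce an identifier for each relationship instance and to link it to each role filler by a separate binary statement. The Python sketch below illustrates this under stated assumptions: the function names, the "name.role" predicate naming scheme, and the instance identifier "p1" are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (instance id, predicate, role filler)


def decompose(instance: str, name: str, roles: Sequence[str], fillers: Sequence[str]) -> List[Triple]:
    """Represent one n-ary relationship instance as n binary statements."""
    return [(instance, f"{name}.{role}", filler) for role, filler in zip(roles, fillers)]


def recompose(triples: Sequence[Triple], name: str, roles: Sequence[str]) -> List[Tuple[str, ...]]:
    """Rebuild the n-ary tuples by grouping the binary statements per instance."""
    by_instance: Dict[str, Dict[str, str]] = {}
    for instance, predicate, filler in triples:
        by_instance.setdefault(instance, {})[predicate] = filler
    return [
        tuple(slots[f"{name}.{role}"] for role in roles)
        for slots in by_instance.values()
    ]


roles = ("father", "mother", "child")
triples = decompose("p1", "parentage", roles, ("Tom", "Ann", "Sue"))

for triple in triples:
    print(triple)
# ('p1', 'parentage.father', 'Tom')
# ('p1', 'parentage.mother', 'Ann')
# ('p1', 'parentage.child', 'Sue')

print(recompose(triples, "parentage", roles))  # [('Tom', 'Ann', 'Sue')]
```

The verbosity noted above is visible even here: a single ternary fact becomes three statements plus an extra identifier, and the fact that the three belong together is carried only by the shared identifier.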
Note that SQL directly supports storing and querying n-ary relationships. In a relational database, an n-ary relationship can be directly represented and queried via SQL as a single relation (table) of n columns (attributes). Thus one might have a relation "parentage", with 3 columns, respectively, father, mother, child. This is an example of a ternary relationship.
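The "parentage" relation can be tried out with Python's built-in sqlite3 module. This is only a sketch: the sample rows are made up, and the choice of `child` as primary key is an added assumption (each child appears in exactly one row), not something stated above.

```python
import sqlite3

# An in-memory database is enough for a demonstration.
conn = sqlite3.connect(":memory:")

# One table with three columns represents the ternary relationship directly.
# Assumption: each child is recorded once, so `child` serves as the key.
conn.execute(
    """
    CREATE TABLE parentage (
        father TEXT NOT NULL,
        mother TEXT NOT NULL,
        child  TEXT NOT NULL,
        PRIMARY KEY (child)
    )
    """
)

conn.executemany(
    "INSERT INTO parentage (father, mother, child) VALUES (?, ?, ?)",
    [("Tom", "Ann", "Sue"), ("Tom", "Ann", "Max"), ("Raj", "Mei", "Lin")],
)

# Query the relationship by any of its roles; here, all children of a given mother.
children = [
    row[0]
    for row in conn.execute(
        "SELECT child FROM parentage WHERE mother = ? ORDER BY child", ("Ann",)
    )
]
print(children)  # ['Max', 'Sue']
```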
Here the central issue is to define suitable versions of the binary relationship properties (such as reflexivity, symmetry, transitivity) which make sense for n-ary relationships. A common extension of binary relationships would be to add a third argument representing context in which the relationship holds, e.g., spatial, temporal, environmental contexts. The definitions below extend the binary relationship properties, so as to be consistent with possible additional arguments (roles) which specify context in which the binary relationship holds. Thus for example:
[Kevin Keck does not support the specification of these properties for n-ary relationships and hence they have not been included in the latest version of the proposed revised metamodel.]
This page is maintained by Frank Olken. It was last updated on 2005-09-15 at 8:36 PM PDT.