Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG

ISO/IEC 19757-2:2003 specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a pattern for the structure and content of an XML document. The pattern is specified by using a regular tree grammar. A RELAX NG schema is itself an XML document. ISO/IEC 19757-2:2003 specifies: when an XML document is a correct RELAX NG schema; and when an XML document is valid with respect to a correct RELAX NG schema.

Technologies de l'information — Langage de définition de schéma de documents (DSDL) — Partie 2: Validation de grammaire orientée courante — RELAX NG

General Information

Status: Withdrawn
Publication Date: 27-Nov-2003
Withdrawal Date: 27-Nov-2003

ICS: 35.240.30 - IT applications in information, documentation and publishing

Technical Committee: ISO/IEC JTC 1/SC 34 - Document description and processing languages
Drafting Committee: ISO/IEC JTC 1/SC 34 - Document description and processing languages

Current Stage: 9599 - Withdrawal of International Standard
Start Date: 10-Dec-2008
Completion Date: 14-Feb-2026

Relations

Amended By: ISO/IEC 19757-2:2003/Amd 1:2006 - Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG — Amendment 1: Compact Syntax
Effective Date: 06-Jun-2022

Revised: ISO/IEC 19757-2:2008 - Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG
Effective Date: 14-Aug-2008

Parent: ISO/IEC 19757-2:2003/Amd 1:2006 - Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG — Amendment 1: Compact Syntax
Effective Date: 15-Apr-2008

Standard

ISO/IEC 19757-2:2003 - Information technology -- Document Schema Definition Language (DSDL)

English language (34 pages)

Frequently Asked Questions

What is ISO/IEC 19757-2:2003?

ISO/IEC 19757-2:2003 is an International Standard published jointly by ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission). Its full title is "Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG". It specifies RELAX NG, a schema language for XML in which a schema is itself an XML document that uses a regular tree grammar to describe a pattern for the structure and content of XML documents. The standard defines when an XML document is a correct RELAX NG schema and when an XML document is valid with respect to a correct RELAX NG schema.

What ICS categories does ISO/IEC 19757-2:2003 belong to?

ISO/IEC 19757-2:2003 is classified under the following ICS (International Classification for Standards) categories: 35.240.30 - IT applications in information, documentation and publishing. The ICS classification helps identify the subject area and facilitates finding related standards.

What standards are related to ISO/IEC 19757-2:2003?

ISO/IEC 19757-2:2003 has the following relationships with other standards: it is amended by ISO/IEC 19757-2:2003/Amd 1:2006 (Compact Syntax) and revised by ISO/IEC 19757-2:2008. Understanding these relationships helps ensure you are using the most current and applicable version of the standard.

Standards Content (Sample)

INTERNATIONAL STANDARD ISO/IEC 19757-2
First edition 2003-12-01

Information technology — Document Schema Definition Language (DSDL) —
Part 2: Regular-grammar-based validation — RELAX NG

Technologies de l’information — Langage de définition de schéma de documents (DSDL) —
Partie 2: Validation de grammaire orientée courante — RELAX NG

Reference number
© ISO/IEC 2003

© ISO/IEC 2003
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means,
electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or
ISO's member body in the country of the requester.
ISO copyright office
Case postale 56 • CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org
Published in Switzerland

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Notation
4.1 EBNF
4.2 Inference rules
4.2.1 Variables
4.2.2 Propositions
4.2.3 Expressions
5 Data model
6 Full syntax
7 Simplification
7.1 General
7.2 Annotations
7.3 Whitespace
7.4 datatypeLibrary attribute
7.5 type attribute of value element
7.6 href attribute
7.7 externalRef element
7.8 include element
7.9 name attribute of element and attribute elements
7.10 ns attribute
7.11 QNames
7.12 div element
7.13 Number of child elements
7.14 mixed element
7.15 optional element
7.16 zeroOrMore element
7.17 Constraints
7.18 combine attribute
7.19 grammar element
7.20 define and ref elements
7.21 notAllowed element
7.22 empty element
8 Simple syntax
9 Semantics
9.1 Inference rules
9.2 Name classes
9.3 Patterns
9.3.1 choice pattern
9.3.2 group pattern
9.3.3 empty pattern
9.3.4 text pattern
9.3.5 oneOrMore pattern
9.3.6 interleave pattern
9.3.7 element and attribute pattern
9.3.8 data and value pattern
9.3.9 Built-in datatype library
9.3.10 list pattern
9.4 Validity
10 Restrictions
10.1 General
10.2 Prohibited paths
10.2.1 General
10.2.2 attribute pattern
10.2.3 oneOrMore pattern
10.2.4 list pattern
10.2.5 except element in data pattern
10.2.6 start element
10.3 String sequences
10.4 Restrictions on attributes
10.5 Restrictions on interleave
11 Conformance
Annex A (normative) RELAX NG schema for RELAX NG
Annex B (informative) Examples
B.1 Data model
B.2 Full syntax example
B.3 Simple syntax example
B.4 Validation example
Bibliography

Foreword
ISO (the International Organization for Standardization) and IEC (the International Electrotechnical
Commission) form the specialized system for worldwide standardization. National bodies that are members of
ISO or IEC participate in the development of International Standards through technical committees
established by the respective organization to deal with particular fields of technical activity. ISO and IEC
technical committees collaborate in fields of mutual interest. Other international organizations, governmental
and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information
technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.
International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.
The main task of the joint technical committee is to prepare International Standards. Draft International
Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as
an International Standard requires approval by at least 75 % of the national bodies casting a vote.
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent
rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.
ISO/IEC 19757-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology,
Subcommittee SC 34, Document description and processing languages.
ISO/IEC 19757 consists of the following parts, under the general title Information technology — Document
Schema Definition Language (DSDL):
— Part 2: Regular-grammar-based validation — RELAX NG
The following parts are under preparation.
— Part 1: Overview
— Part 4: Selection of validation candidates
Rule-based validation — Schematron, Datatypes, Path-based integrity constraints, Character repertoire
validation, Declarative document manipulation, Datatype- and namespace-aware DTDs and Interoperability
framework will form the subjects of future Parts 3, 5, 6, 7, 8, 9 and 10, respectively.

Introduction
The structure of this part of ISO/IEC 19757 is as follows. Clause 5 describes the data model, which is the
abstraction of an XML document used throughout the rest of the document. Clause 6 describes the syntax of a
RELAX NG schema. Clause 7 describes a sequence of transformations that are applied to simplify a RELAX NG
schema, and also specifies additional requirements on a RELAX NG schema. Clause 8 describes the syntax that
results from applying the transformations; this simple syntax is a subset of the full syntax. Clause 9 describes the
semantics of a correct RELAX NG schema that uses the simple syntax; the semantics specify when an element is
valid with respect to a RELAX NG schema. Clause 10 describes requirements that apply to a RELAX NG schema
after it has been transformed into simple form. Finally, Clause 11 describes conformance requirements for RELAX
NG validators.
This part of ISO/IEC 19757 is based on the RELAX NG Specification [1]. A tutorial for RELAX NG is available
separately (see the RELAX NG Tutorial [2]).

INTERNATIONAL STANDARD ISO/IEC 19757-2:2003(E)

Information technology — Document Schema Definition
Language (DSDL) —
Part 2:
Regular-grammar-based validation — RELAX NG
1 Scope
This part of ISO/IEC 19757 specifies RELAX NG, a schema language for XML. A RELAX NG schema specifies a
pattern for the structure and content of an XML document. The pattern is specified by using a regular tree
grammar. This part of ISO/IEC 19757 establishes requirements for RELAX NG schemas and specifies when an
XML document matches the pattern specified by a RELAX NG schema.
2 Normative references
The following referenced documents are indispensable for the application of this document. For dated references,
only the edition cited applies. For undated references, the latest edition of the referenced document (including any
amendments) applies.
NOTE Each of the following documents has a unique identifier that is used to cite the document in the text. The unique
identifier consists of the part of the reference up to the first comma.
W3C XML, Extensible Markup Language (XML) 1.0 (Second Edition), W3C Recommendation, 6 October 2000,
available at
W3C XML-Names, Namespaces in XML, W3C Recommendation, 14 January 1999, available at

W3C XLink, XML Linking Language (XLink) Version 1.0, W3C Recommendation, 27 June 2001, available at

W3C XML-Infoset, XML Information Set, W3C Recommendation, 24 October 2001, available at

IETF RFC 2045, Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies,
Internet Standards Track Specification, November 1996, available at
IETF RFC 2046, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, Internet Standards Track
Specification, November 1996, available at
IETF RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax, Internet Standards Track Specification,
August 1998, available at
IETF RFC 2732, Format for Literal IPv6 Addresses in URL's, Internet Standards Track Specification, December
1999, available at
IETF RFC 3023, XML Media Types, Internet Standards Track Specification, January 2001, available at

3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
3.1
resource
something with identity, potentially addressable by a URI
3.2
URI
compact string of characters that uses the syntax defined in IETF RFC 2396 to identify an abstract or physical
resource
3.3
URI reference
URI or relative URI and optional fragment identifier
3.4
relative URI
form of URI reference that can be resolved with respect to a base URI to produce another URI
3.5
base URI
URI used to resolve relative URIs
3.6
fragment identifier
additional information in a URI reference used by a user agent after the retrieval action on a URI has been
successfully performed
3.7
instance
XML document that is being validated with respect to a RELAX NG schema
3.8
space character
character with the code value #x20
3.9
whitespace character
character with the code value #x20, #x9, #xA or #xD
3.10
name
pair of a URI and a local name
3.11
namespace URI
URI that is part of a name
3.12
local name
NCName that is part of a name
3.13
NCName
string that matches the NCName production of W3C XML-Names
3.14
name class
part of a schema that can be matched against a name
3.15
pattern
part of a schema that can be matched against a set of attributes and a sequence of elements and strings
3.16
foreign attribute
attribute with a name whose namespace URI is neither the empty string nor the RELAX NG namespace URI
3.17
foreign element
an element with a name whose namespace URI is not the RELAX NG namespace URI
3.18
full syntax
syntax of a RELAX NG grammar before simplification
3.19
simple syntax
syntax of a RELAX NG grammar after simplification
3.20
simplification
transformation of a RELAX NG schema in the full syntax to a schema in the simple syntax
3.21
datatype library
mapping from local names to datatypes
NOTE a datatype library is identified by a URI
3.22
datatype
set of strings together with an equivalence relation on that set
3.23
axiom
proposition that is provable unconditionally
3.24
inference rule
rule consisting of one or more positive or negative antecedents and exactly one consequent, which makes the
consequent provable if all the positive antecedents are provable and none of the negative antecedents is
provable
3.25
valid with respect to a schema
member of the set of XML documents described by the schema
3.26
schema
specification of a set of XML documents
3.27
grammar
start pattern together with a mapping from NCNames to patterns
3.28
correct schema
schema that satisfies all the requirements of this part of ISO/IEC 19757
3.29
validator
software module that determines whether a schema is correct and whether an instance is valid with respect to a
schema
3.30
path
list of NCNames separated by / or //
3.31
infoset
an abstraction of an XML document defined by W3C XML-Infoset
3.32
information item
constituent of an information set
3.33
data model
abstract representation of an XML document defined by this part of ISO/IEC 19757
3.34
XML document
string that is a well-formed XML document as defined in W3C XML
3.35
EBNF
Extended BNF
notation used to describe context-free grammars
3.36
weak matching
kind of matching specified in detail in 9.3.7
3.37
in-scope grammar
nearest ancestor grammar element
3.38
content-type
one of the three values empty, complex, or simple
3.39
mixed sequence
sequence that may contain both elements and strings
4 Notation
4.1 EBNF
This part of ISO/IEC 19757 uses EBNF notation to describe the full syntax and the simple syntax of RELAX NG. A
description of a grammar in EBNF consists of one or more production rules. Each production rule consists of the
name of a non-terminal, followed by ::=, followed by a list of alternatives separated by |. Within an alternative,
italic type is used to reference a non-terminal, concatenation indicates sequencing, [] indicates optionality, +
indicates repetition one or more times and * indicates repetition zero or more times; other characters in normal
type stand for themselves.
4.2 Inference rules
4.2.1 Variables
The symbol used for a variable indicates the variable's range as follows:
— n ranges over names
— nc ranges over name classes
— ln ranges over local names; a local name is a string that matches the NCName production of W3C XML-Names,
that is, a name with no colons
— u ranges over URIs
— cx ranges over contexts (as defined in Clause 5)
— a ranges over sets of attributes; a set with a single member is considered the same as that member
— m ranges over sequences of elements and strings; a sequence with a single member is considered the same
as that member; the sequences ranged over by m may contain consecutive strings and may contain strings
that are empty
NOTE There are sequences ranged over by m that cannot occur as the children of an element.
— p ranges over patterns (elements matching the pattern production)
— s ranges over strings
— ws ranges over the empty sequence and strings that consist entirely of whitespace
— params ranges over sequences of parameters
— e ranges over elements
— ct ranges over content-types
4.2.2 Propositions
The following notation is used for propositions:
— n in nc means that name n is a member of name class nc
— cx ⊦ a; m =~ p means that with respect to context cx, the attributes a and the sequence of elements and
strings m matches the pattern p
— disjoint(a_1, a_2) means that there is no name that is the name of both an attribute in a_1 and of an
attribute in a_2
— m_1 interleaves m_2; m_3 means that m_1 is an interleaving of m_2 and m_3
— cx ⊦ a; m =~_weak p means that with respect to context cx, the attributes a and the sequence of elements and
strings m weakly matches the pattern p
— okAsChildren(m) means that the mixed sequence m can occur as the children of an element: it does not
contain any member that is an empty string, nor does it contain two consecutive members that are both
strings
— deref(ln) = <element> nc p </element> means that the grammar contains
<define name="ln"> <element> nc p </element> </define>
— datatypeAllows(u, ln, params, s, cx) means that in the datatype library identified by URI u, the string s
interpreted with context cx is a legal value of datatype ln with parameters params
— datatypeEqual(u, ln, s_1, cx_1, s_2, cx_2) means that in the datatype library identified by URI u, string s_1
interpreted with context cx_1 represents the same value of the datatype ln as the string s_2 interpreted in the
context of cx_2
— s_1 = s_2 means that s_1 and s_2 are identical
— valid(e) means that the element e is valid with respect to the grammar
— start() = p means that the grammar contains <start> p </start>
— groupable(ct_1, ct_2) means that the content-types ct_1 and ct_2 are groupable
— p :_c ct means that pattern p has content-type ct
— incorrectSchema() means that the schema is incorrect
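Three of the propositions above are simple enough to illustrate directly. The following Python sketch is an illustration only and is not part of this standard. It assumes that a mixed sequence is modelled as a list whose members are either strings or arbitrary non-string objects standing in for elements, and that a set of attributes is modelled as a dict keyed by a (namespace URI, local name) pair. The function names are hypothetical.

```python
from functools import lru_cache
from typing import Any, Dict, Sequence, Tuple

Name = Tuple[str, str]  # assumed representation: (namespace URI, local name)


def ok_as_children(m: Sequence[Any]) -> bool:
    """okAsChildren(m): no empty string and no two consecutive strings."""
    previous_was_string = False
    for member in m:
        is_string = isinstance(member, str)
        if is_string and (member == "" or previous_was_string):
            return False
        previous_was_string = is_string
    return True


def disjoint(a1: Dict[Name, str], a2: Dict[Name, str]) -> bool:
    """disjoint(a1, a2): no attribute name occurs in both sets."""
    return not (a1.keys() & a2.keys())


def interleaves(m1: Sequence[Any], m2: Sequence[Any], m3: Sequence[Any]) -> bool:
    """m1 interleaves m2; m3: m1 merges m2 and m3, preserving the order of each."""
    if len(m1) != len(m2) + len(m3):
        return False

    @lru_cache(maxsize=None)
    def match(i: int, j: int) -> bool:
        # i members of m2 and j members of m3 have been consumed so far.
        k = i + j
        if k == len(m1):
            return True
        if i < len(m2) and m1[k] == m2[i] and match(i + 1, j):
            return True
        return j < len(m3) and m1[k] == m3[j] and match(i, j + 1)

    return match(0, 0)
```

Members are compared with `==` in this sketch, which is sufficient for strings; a fuller model would compare elements by identity or by position.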
4.2.3 Expressions
The following notation is used for expressions in propositions:
— name( u, ln ) returns a name with URI u and local name ln
— m_1, m_2 returns the concatenation of the sequences m_1 and m_2
— a_1 + a_2 returns the union of a_1 and a_2
— ( ) returns an empty sequence
— { } returns an empty set
— "" returns an empty string
— attribute( n, s ) returns an attribute with name n and value s
— element( n, cx, a, m ) returns an element with name n, context cx, attributes a and mixed sequence m as
children
— max( ct_1, ct_2 ) returns the maximum of ct_1 and ct_2, where the content-types in increasing order are
empty( ), complex( ), simple( )
— normalizeWhiteSpace( s ) returns the string s, with leading and trailing whitespace characters removed, and
with each other maximal sequence of whitespace characters replaced by a single space character
— split( s ) returns a sequence of strings one for each whitespace delimited token of s; each string in the
returned sequence will be non-empty and will not contain any whitespace
— context( u, cx ) returns a context which is the same as cx except that the default namespace is u; if u is the
empty string, then there is no default namespace in the constructed context
— empty( ) returns the empty content-type
— complex( ) returns the complex content-type
— simple( ) returns the simple content-type
— [cx] within the start-tag of a pattern refers to the context of the pattern element
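The string and content-type expressions above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative definition: the function and class names are hypothetical, and content-types are modelled as an integer enumeration so that the built-in max gives the ordering empty, complex, simple. Only the four whitespace characters of 3.9 (#x20, #x9, #xA, #xD) are treated as whitespace, which is why the sketch does not use Python's argument-less str.split().

```python
import re
from enum import IntEnum
from typing import List

# Whitespace characters as defined in 3.9: #x20, #x9, #xA and #xD only.
_WHITESPACE_RUN = re.compile(r"[\x20\x09\x0A\x0D]+")


def split(s: str) -> List[str]:
    """split(s): the non-empty whitespace-delimited tokens of s."""
    return [token for token in _WHITESPACE_RUN.split(s) if token]


def normalize_white_space(s: str) -> str:
    """normalizeWhiteSpace(s): trim, then collapse each whitespace run to one space."""
    return " ".join(split(s))


class ContentType(IntEnum):
    """Content-types in increasing order, as used by max(ct_1, ct_2)."""
    EMPTY = 0
    COMPLEX = 1
    SIMPLE = 2


def max_content_type(ct1: ContentType, ct2: ContentType) -> ContentType:
    """max(ct_1, ct_2) under the ordering empty < complex < simple."""
    return max(ct1, ct2)
```

A character such as U+00A0 (no-break space) is not a whitespace character in the sense of 3.9, so both functions leave it untouched.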
5 Data model
RELAX NG deals with XML documents representing both schemas and instances through an abstract data
model. XML documents representing schemas and instances shall be well-formed in conformance with W3C XML
and shall conform to the constraints of W3C XML-Names.
An XML document is represented by an element. An element consists of
— a name
— a context
— a set of attributes
— an ordered sequence of zero or more children; each child is either an element or a non-empty string; the
sequence never contains two consecutive strings
A name consists of
— a string representing the namespace URI; the empty string has special significance, representing the
absence of any namespace
— a string representing the local name; this string matches the NCName production of W3C XML-Names
A context consists of
— a base URI
— a namespace map; this maps prefixes to namespace URIs, and also may specify a default namespace URI
(as declared by the xmlns attribute)
An attribute consists of
— a name
— a string representing the value
A string consists of a sequence of zero or more characters, where a character is as defined in W3C XML.
The element for an XML document is constructed from the infoset (see W3C XML-Infoset) of the XML document
as follows. The notation [x] refers to the value of the x property of an information item. An element is constructed
from a document information item by constructing an element from the [document element]. An element is
constructed from an element information item by constructing the name from the [namespace name] and [local
name], the context from the [base URI] and [in-scope namespaces], the attributes from the [attributes], and the
children from the [children]. The attributes of an element are constructed from the unordered set of attribute
information items by constructing an attribute for each attribute information item. The children of an element are
constructed from the list of child information items first by removing information items other than element
information items and character information items, and then by constructing an element for each element
information item in the list and a string for each maximal sequence of character information items. An attribute is
constructed from an attribute information item by constructing the name from the [namespace name] and [local
name], and the value from the [normalized value]. When constructing the name of an element or attribute from
the [namespace name] and [local name], if the [namespace name] property is not present, then the name is
constructed from an empty string and the [local name]. A string is constructed from a sequence of character
information items by constructing a character from the [character code] of each character information item.
It is possible for there to be multiple distinct infosets for a single XML document. This is because XML parsers are
not required to process all DTD declarations or expand all external parsed general entities. Amongst these
multiple infosets, there is exactly one infoset for which [all declarations processed] is true and which does not
contain any unexpanded entity reference information items. This is the infoset that is the basis for defining the
RELAX NG data model.
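The data model of this clause can be approximated with a few Python types. The sketch below is an illustration only and makes several simplifying assumptions: it builds the model from Python's xml.etree.ElementTree rather than from a true infoset, it reduces the context to a base URI supplied by the caller (the namespace map and xml:base processing are omitted), and the names Element, split_name and build are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

Name = Tuple[str, str]  # (namespace URI, local name); "" means no namespace


@dataclass
class Element:
    name: Name
    base_uri: str  # simplified context: the namespace map is omitted here
    attributes: Dict[Name, str] = field(default_factory=dict)
    children: List[Union["Element", str]] = field(default_factory=list)


def split_name(tag: str) -> Name:
    """Turn ElementTree's '{uri}local' notation into a (uri, local) pair."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return ("", tag)


def build(node: ET.Element, base_uri: str = "") -> Element:
    """Construct a data-model element from an ElementTree node."""
    element = Element(split_name(node.tag), base_uri)
    for key, value in node.attrib.items():
        element.attributes[split_name(key)] = value

    def add_text(text):
        # Children never contain empty strings or two consecutive strings.
        if not text:
            return
        if element.children and isinstance(element.children[-1], str):
            element.children[-1] += text
        else:
            element.children.append(text)

    add_text(node.text)
    for child in node:
        element.children.append(build(child, base_uri))
        add_text(child.tail)
    return element
```

For example, building the model for `<a xmlns="urn:x" id="1">hi<b/>there</a>` gives an element named ("urn:x", "a") with one attribute named ("", "id") and three children: a string, an element and a string.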
6 Full syntax
The following grammar in EBNF notation summarizes the syntax of RELAX NG. Although the notation is based on
the XML representation of a RELAX NG schema as a sequence of characters, the grammar operates at the data
model level. For example, although the syntax uses <text/>, an instance or schema can use <text></text> instead,
because they both represent the same element at the data model level. All elements shown in the grammar are
qualified with the namespace URI:
http://relaxng.org/ns/structure/1.0
The symbols QName and NCName are defined in W3C XML-Names. The anyURI symbol indicates a string that,
after escaping of disallowed values as described in Section 5.4 of W3C XLink, is a URI reference as defined in
IETF RFC 2396 (as modified by IETF RFC 2732). The symbol string matches any string.
In addition to the attributes shown explicitly, any element can have an ns attribute and any element can have a
datatypeLibrary attribute. The ns attribute can have any value. The value of the datatypeLibrary attribute shall
match the anyURI symbol as described in the previous paragraph; in addition, it shall not use the relative form of
URI reference and shall not have a fragment identifier; as an exception to this, the value may be the empty string.
Any element can also have foreign attributes in addition to the attributes shown in the grammar. A foreign
attribute is an attribute with a name whose namespace URI is neither the empty string nor the RELAX NG
namespace URI. Any element that cannot have string children (that is, any element other than value, param and
name) may have foreign child elements in addition to the child elements shown in the grammar. A foreign
element is an element with a name whose namespace URI is not the RELAX NG namespace URI. There are no
constraints on the relative position of foreign child elements with respect to other child elements.
Any element can also have as children strings that consist entirely of whitespace characters, where a whitespace
character is one of #x20, #x9, #xD or #xA. There are no constraints on the relative position of whitespace string
children with respect to child elements.
Leading and trailing whitespace is allowed for the value of each name, type and combine attribute and for the
content of each name element.
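pattern ::=
<element name="QName"> pattern+ </element>
| <element> nameClass pattern+ </element>
| <attribute name="QName"> [pattern] </attribute>
| <attribute> nameClass [pattern] </attribute>
| <group> pattern+ </group>
| <interleave> pattern+ </interleave>
| <choice> pattern+ </choice>
| <optional> pattern+ </optional>
| <zeroOrMore> pattern+ </zeroOrMore>
| <oneOrMore> pattern+ </oneOrMore>
| <list> pattern+ </list>
| <mixed> pattern+ </mixed>
| <ref name="NCName"/>
| <parentRef name="NCName"/>
| <empty/>
| <text/>
| <value [type="NCName"]> string </value>
| <data type="NCName"> param* [exceptPattern] </data>
| <notAllowed/>
| <externalRef href="anyURI"/>
| <grammar> grammarContent* </grammar>
param ::=
<param name="NCName"> string </param>
exceptPattern ::=
<except> pattern+ </except>
grammarContent ::=
start
| define
| <div> grammarContent* </div>
| <include href="anyURI"> includeContent* </include>
includeContent ::=
start
| define
| <div> includeContent* </div>
start ::=
<start [combine="method"]> pattern </start>
define ::=
<define name="NCName" [combine="method"]> pattern+ </define>
method ::=
choice
| interleave
nameClass ::=
<name> QName </name>
| <anyName> [exceptNameClass] </anyName>
| <nsName> [exceptNameClass] </nsName>
| <choice> nameClass+ </choice>
exceptNameClass ::=
<except> nameClass+ </except>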
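The rules for foreign attributes and foreign elements given earlier in this clause can be sketched as follows. This Python fragment is an illustration only. It assumes that the schema has been parsed with xml.etree.ElementTree, whose '{uri}local' naming convention it relies on, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

RELAXNG_NS = "http://relaxng.org/ns/structure/1.0"
# Elements that can have string children and so cannot have foreign child elements.
STRING_CONTENT = {"value", "param", "name"}


def split_tag(tag):
    """Split ElementTree's '{uri}local' notation into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def is_foreign_attribute(attribute_key):
    """Namespace URI is neither the empty string nor the RELAX NG namespace URI."""
    uri, _ = split_tag(attribute_key)
    return uri not in ("", RELAXNG_NS)


def is_foreign_element(elem):
    """Namespace URI is not the RELAX NG namespace URI."""
    uri, _ = split_tag(elem.tag)
    return uri != RELAXNG_NS


def misplaced_foreign_children(elem):
    """Yield foreign elements found inside value, param or name elements."""
    uri, local = split_tag(elem.tag)
    for child in elem:
        if is_foreign_element(child):
            if uri == RELAXNG_NS and local in STRING_CONTENT:
                yield child
        else:
            yield from misplaced_foreign_children(child)
```

The sketch does not descend into foreign elements, since their content is not constrained by the grammar.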
7 Simplification
7.1 General
The full syntax given in the previous clause is transformed into a simpler syntax by applying the following
transformation rules in order. The effect shall be as if each rule was applied to all elements in the schema before
the next rule is applied. A transformation rule may also specify constraints that shall be satisfied by a correct
schema. The transformation rules are applied at the data model level. Before the transformations are applied, the
schema is parsed into an element in the data model.
7.2 Annotations
Foreign attributes and elements are removed.
NOTE It is safe to remove xml:base attributes at this stage because xml:base attributes are used in determining the [base
URI] of an element information item, which is in turn used to construct the base URI of the context of an element. Thus, after a
document has been parsed into an element in the data model, xml:base attributes can be discarded.
7.3 Whitespace
For each element other than value and param, each child that is a string containing only whitespace characters is
removed.
Leading and trailing whitespace characters are removed from the value of each name, type and combine attribute
and from the content of each name element.
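The two transformations of 7.2 and 7.3 can be sketched on an ElementTree representation of the schema. This is an illustration only, with hypothetical function names. ElementTree stores string children as text and tail properties, so the sketch handles each of them separately. When a foreign element is removed, its tail is discarded with it; in a correct schema that tail can only be whitespace, which 7.3 would remove in any case.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"
WS = " \t\n\r"  # whitespace characters as defined in 3.9


def ns_of(tag):
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def local_of(tag):
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


def strip_annotations(elem):
    """7.2: remove foreign attributes and foreign elements."""
    for key in list(elem.attrib):
        if ns_of(key) not in ("", RNG):
            del elem.attrib[key]
    for child in list(elem):
        if ns_of(child.tag) != RNG:
            elem.remove(child)  # the child's tail text is dropped with it
        else:
            strip_annotations(child)


def strip_whitespace(elem):
    """7.3: drop whitespace-only string children and trim selected values."""
    if local_of(elem.tag) not in ("value", "param"):
        if elem.text is not None and elem.text.strip(WS) == "":
            elem.text = None
        for child in elem:
            if child.tail is not None and child.tail.strip(WS) == "":
                child.tail = None
    for attr in ("name", "type", "combine"):
        if attr in elem.attrib:
            elem.attrib[attr] = elem.attrib[attr].strip(WS)
    if local_of(elem.tag) == "name" and elem.text is not None:
        elem.text = elem.text.strip(WS)
    for child in elem:
        strip_whitespace(child)
```

Applying strip_annotations to the whole schema before strip_whitespace follows the requirement in 7.1 that each rule takes effect on all elements before the next rule is applied.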
7.4 datatypeLibrary attribute
The value of each datatypeLibrary attribute is transformed by escaping disallowed characters as specified in
Section 5.4 of W3C XLink.
For any data or value element that does not have a datatypeLibrary attribute, a datatypeLibrary attribute is added.
The value of the added datatypeLibrary attribute is the value of the datatypeLibrary attribute of the nearest
ancestor element that has a datatypeLibrary attribute, or the empty string if there is no such ancestor. Then, any
datatypeLibrary attribute that is on an element other than data or value is removed.
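The inheritance of the datatypeLibrary attribute can be sketched as a single recursive pass. The following Python fragment is an illustration only: it operates on an ElementTree representation of the schema, the function name is hypothetical, and the escaping step of the first paragraph is assumed to have been applied already.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"


def propagate_datatype_library(elem, inherited=""):
    """Give every data and value element an explicit datatypeLibrary attribute.

    `inherited` is the value from the nearest ancestor that has the attribute,
    or the empty string if there is no such ancestor.
    """
    library = elem.attrib.get("datatypeLibrary", inherited)
    if elem.tag in (RNG + "data", RNG + "value"):
        elem.set("datatypeLibrary", library)
    else:
        # The attribute is removed from every element other than data or value.
        elem.attrib.pop("datatypeLibrary", None)
    for child in elem:
        propagate_datatype_library(child, library)
```

After the pass, a data element nested anywhere below an element that declared a datatype library carries that library's URI itself, and no other element keeps the attribute.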
7.5 type attribute of value element
For any value element that does not have a type attribute, a type attribute is added with a value of token and the
value of the datatypeLibrary attribute is changed to the empty string.
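This rule amounts to a default: a value element without a type attribute is treated as the token datatype of the built-in library, which is identified by the empty string. A minimal Python sketch, assuming an ElementTree representation of the schema and using a hypothetical function name:

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"


def default_value_type(elem):
    """7.5: add type="token" and an empty datatypeLibrary to untyped value elements."""
    if elem.tag == RNG + "value" and "type" not in elem.attrib:
        elem.set("type", "token")
        elem.set("datatypeLibrary", "")
    for child in elem:
        default_value_type(child)
```

A value element that already has a type attribute is left unchanged, including its datatypeLibrary attribute.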
7.6 href attribute
The value of the href attribute on an externalRef or include element is first transformed by escaping disallowed
characters as specified in Section 5.4 of W3C XLink. The URI reference is then resolved into an absolute form as
described in Section 5.2 of IETF RFC 2396 using the base URI from the context of the element that bears the href
attribute.
The value of the href attribute is used to construct an element (as specified in Clause 5). This shall be done as
follows. The URI reference consists of the URI itself and an optional fragment identifier. The resource identified by
the URI is retrieved. The result is a MIME entity (see IETF RFC 2045): a sequence of bytes labeled with a MIME
media type (see IETF RFC 2046). The media type determines how an element is constructed from the MIME
entity and optional fragment identifier. When the media type is application/xml or text/xml, the MIME entity shall
be parsed as an XML document in accordance with the applicable RFC (at the time of writing IETF RFC 3023)
and an element constructed from the result of the parse as specified in Clause 5. In particular, the charset
parameter shall be handled as specified by the RFC. This specification does not define the handling of media
types other than application/xml and text/xml. The href attribute shall not include a fragment identifier unless the
registration of the media type of the resource identified by the attribute defines the interpretation of fragment
identifiers for that media type.
NOTE IETF RFC 3023 does not define the interpretation of fragment identifiers for application/xml or text/xml.
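The first two steps of this subclause, escaping and resolution, can be sketched with Python's urllib.parse. This is an illustration under stated assumptions, not a conforming implementation: the function names are hypothetical, the set of characters left unescaped is this sketch's reading of Section 5.4 of W3C XLink, and urljoin follows a later URI specification than IETF RFC 2396, so edge cases may resolve differently. Retrieval of the resource and MIME handling are not shown.

```python
from urllib.parse import quote, urldefrag, urljoin

# Characters left untouched: the reserved characters and unreserved marks of
# IETF RFC 2396, plus "%", "#" and the square brackets re-allowed by RFC 2732.
# ASCII letters and digits are always left untouched by quote().
_SAFE = ";/?:@&=+$,-_.!~*'()%#[]"


def escape_disallowed(uri_reference: str) -> str:
    """Percent-encode disallowed characters using their UTF-8 bytes."""
    return quote(uri_reference, safe=_SAFE)


def resolve_href(href: str, base_uri: str):
    """Return (absolute URI, fragment identifier) for an href attribute value."""
    absolute = urljoin(base_uri, escape_disallowed(href))
    uri, fragment = urldefrag(absolute)
    return uri, fragment
```

For example, an href of `../common/my schema.rng` on an element whose base URI is `http://example.org/schemas/main/doc.rng` resolves to `http://example.org/schemas/common/my%20schema.rng` with an empty fragment identifier.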
7.7 externalRef element
An externalRef element is transformed as follows. An element is constructed using the URI reference that is the
value of the href attribute as specified in 7.6. This element shall match the syntax for pattern. The element is
transformed by recursively applying the rules from this subclause and from previous subclauses of this clause.
This shall not result in a loop. In other words, the transformation of the referenced element shall not require the
dereferencing of an externalRef element with an href attribute with the same value.
Any ns attribute on the externalRef element is transferred to the referenced element if the referenced element
does not already have an ns attribute. The externalRef element is then replaced by the referenced element.
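The replacement of externalRef elements, including the prohibition of loops, can be sketched informatively as follows. The sketch assumes that href values have already been resolved as specified in 7.6, that elements are xml.etree.ElementTree elements with the RELAX NG namespace omitted from tag names, and that a caller-supplied function load (hypothetical) returns the element constructed for a given URI. It does not check that the referenced element matches the syntax for pattern, and it does not apply the other rules of this clause.

```python
import xml.etree.ElementTree as ET


def expand_external_refs(el, load, active=()):
    """Return el with externalRef elements replaced; el itself may be replaced.

    active holds the href values currently being dereferenced, so that a
    reference back to any of them is reported as a loop.
    """
    if el.tag == "externalRef":
        href = el.get("href")
        if href in active:
            raise ValueError("externalRef loop via " + repr(href))
        target = expand_external_refs(load(href), load, active + (href,))
        # Transfer the ns attribute unless the referenced element has one.
        if "ns" in el.attrib and "ns" not in target.attrib:
            target.set("ns", el.get("ns"))
        return target
    for i, child in enumerate(list(el)):
        el[i] = expand_external_refs(child, load, active)
    return el


# Hypothetical in-memory stand-in for resource retrieval.
docs = {
    "urn:x:inline": '<element name="em"><text/></element>',
    "urn:x:loop": '<group><externalRef href="urn:x:loop"/></group>',
}


def load(href):
    return ET.fromstring(docs[href])


schema = ET.fromstring(
    '<group><externalRef href="urn:x:inline" ns="http://example.org/ns"/></group>'
)
result = expand_external_refs(schema, load)
```

After the call, the externalRef child has been replaced by the referenced element element, which now carries the transferred ns attribute. Dereferencing urn:x:loop raises an error because its transformation requires the same href again.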
7.8 include element
An include element is transformed as follows. An element is constructed using the URI reference that is the value
of the href attribute as specified in 7.6. This element shall be a grammar element, matching the syntax for grammar.
This grammar element is transformed by recursively applying the rules from this subclause and from previous
subclauses of this clause. This shall not result in a loop. In other words, the transformation of the grammar
element shall not require the dereferencing of an include element with an href attribute with the same value.
Define the components of an element to be the children of the element together with the components of any div
child elements. If the include element has a start component, then the grammar element shall have at least one
start component; it is then transformed by removing all start components. If the include element has a define
component, then the grammar element shall have at least one define component with the same name; it is then
transformed by removing all such define components.
The include element is transformed into a div element. The attributes of the div element are the attributes of the
include element other than the href attribute. The children of the div element are the grammar element (after the
removal of the start and define components described by the preceding paragraph) followed by the children of the
include element. The grammar element is then renamed to div.
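The following informative sketch illustrates the notion of components and the transformation of an include element into a div element. It assumes that the grammar element has already been constructed and transformed as described above, and that elements are xml.etree.ElementTree elements with the RELAX NG namespace omitted from tag names. The function names components and include_to_div are hypothetical.

```python
import xml.etree.ElementTree as ET


def components(el):
    """Yield (parent, child) for each child of el, descending into div children."""
    for child in list(el):
        yield el, child
        if child.tag == "div":
            yield from components(child)


def include_to_div(include, grammar):
    """Return the div element that replaces include; grammar is modified."""
    overrides = [c for _, c in components(include)]
    has_start = any(c.tag == "start" for c in overrides)
    names = {c.get("name") for c in overrides if c.tag == "define"}

    parts = list(components(grammar))
    if has_start and not any(c.tag == "start" for _, c in parts):
        raise ValueError("grammar has no start component to override")
    missing = names - {c.get("name") for _, c in parts if c.tag == "define"}
    if missing:
        raise ValueError("no define to override for: " + ", ".join(sorted(missing)))

    # Remove the overridden start and define components from the grammar.
    for parent, c in parts:
        if (c.tag == "start" and has_start) or (
            c.tag == "define" and c.get("name") in names
        ):
            parent.remove(c)

    attrs = {k: v for k, v in include.attrib.items() if k != "href"}
    div = ET.Element("div", attrs)
    grammar.tag = "div"  # the grammar element is renamed to div
    div.append(grammar)
    div.extend(list(include))
    return div


# Illustrative input (assumed, not taken from the standard).
grammar = ET.fromstring(
    '<grammar><start><ref name="doc"/></start>'
    '<div><define name="inline"><text/></define></div>'
    '<define name="doc"><ref name="inline"/></define></grammar>'
)
include = ET.fromstring(
    '<include href="base.rng" ns="http://example.org/ns">'
    '<define name="inline"><empty/></define></include>'
)
div = include_to_div(include, grammar)
```

In this example the define component named inline, nested inside a div child of the grammar element, is removed, and the overriding definition from the include element follows the renamed grammar element.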
7.9 name attribute of element and attribute elements
The name attribute on an element or attribute element is transformed into a name child element.
If an attribute element has a name attribute but no ns attribute, then an ns="" attribute is added to the name child
element.
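An informative sketch of the rule of 7.9 follows. It assumes xml.etree.ElementTree elements with the RELAX NG namespace omitted from tag names; the function name name_attribute_to_child is hypothetical.

```python
import xml.etree.ElementTree as ET


def name_attribute_to_child(root):
    """Turn each name attribute on element/attribute into a name child element."""
    # Snapshot the elements first, because children are inserted below.
    for el in list(root.iter()):
        if el.tag in ("element", "attribute") and "name" in el.attrib:
            name = ET.Element("name")
            name.text = el.attrib.pop("name")
            if el.tag == "attribute" and "ns" not in el.attrib:
                # Attribute names default to no namespace.
                name.set("ns", "")
            el.insert(0, name)


# Illustrative input (assumed, not taken from the standard).
schema = ET.fromstring(
    '<element name="a" ns="http://example.org/ns"><attribute name="id"/></element>'
)
name_attribute_to_child(schema)
```

The name child of the attribute element receives ns="", whereas the name child of the element element receives no ns attribute at this stage; it is supplied later by the rule of 7.10.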
7.10 ns attribute
For any name, nsName or value element that does not have an ns attribute, an ns attribute is added. The value
of the added ns attribute is the value of the ns attribute of the nearest ancestor element that has an ns attribute,
or the empty string if there is no such ancestor. Then, any ns attribute that is on an element other than name,
nsName or value is removed.
NOTE 1 The value of the ns attribute is not transformed either by escaping disallowed characters, or in any other way,
because the value of the ns attribute is compared against namespace URIs in the instance, which are not subject to any
transformation.
NOTE 2 Since include and externalRef elements are resolved after datatypeLibrary attributes are added but before ns
attributes are added, ns attributes are inherited into external schemas but datatypeLibrary attributes are not.
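The inheritance of ns attributes can be sketched informatively as a single recursive pass. The sketch assumes xml.etree.ElementTree elements with the RELAX NG namespace omitted from tag names; the function name propagate_ns is hypothetical.

```python
import xml.etree.ElementTree as ET


def propagate_ns(el, inherited=""):
    """Add ns to name, nsName and value elements, then drop it elsewhere."""
    # The element's own ns attribute, or else that of the nearest ancestor.
    ns = el.get("ns", inherited)
    for child in el:
        propagate_ns(child, ns)
    if el.tag in ("name", "nsName", "value"):
        el.set("ns", ns)
    else:
        el.attrib.pop("ns", None)


# Illustrative input (assumed, not taken from the standard).
schema = ET.fromstring(
    '<element ns="http://example.org/a">'
    "<name>doc</name>"
    '<attribute><name ns="">id</name></attribute>'
    "<value>x</value>"
    "</element>"
)
propagate_ns(schema)
```

The value of the ns attribute is copied unchanged, consistent with NOTE 1.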
7.11 QNames
For any name element containing a prefix, the prefix is removed and an ns attribute is added replacing any
existing ns attribute. The value of the added ns attribute is the value to which the namespace map of the context
of the name element maps the prefix. The context shall have a mapping for the prefix.
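The following informative sketch illustrates the rule of 7.11 for a single name element. Because xml.etree.ElementTree does not retain prefix declarations after parsing, the namespace map of the context is passed explicitly as a dictionary; this parameter and the function name resolve_qname are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def resolve_qname(name_el, namespace_map):
    """Replace a prefixed name by its local part plus an ns attribute."""
    text = name_el.text or ""
    if ":" not in text:
        return  # no prefix: nothing to do
    prefix, local = text.split(":", 1)
    if prefix not in namespace_map:
        raise ValueError("no namespace mapping for prefix " + repr(prefix))
    name_el.text = local
    # The added ns attribute replaces any existing ns attribute.
    name_el.set("ns", namespace_map[prefix])


# Illustrative input (assumed, not taken from the standard).
name_el = ET.fromstring('<name ns="http://example.org/old">x:para</name>')
resolve_qname(name_el, {"x": "http://example.org/x"})
```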
7.12 div element
Each div element is replaced by its children.
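An informative sketch of the rule of 7.12, assuming xml.etree.ElementTree elements with the RELAX NG namespace omitted from tag names and a root element that is not itself a div element; the function name flatten_divs is hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten_divs(el):
    """Replace every div descendant of el by its children, preserving order."""
    new_children = []
    for child in list(el):
        flatten_divs(child)  # nested div elements are flattened first
        if child.tag == "div":
            new_children.extend(list(child))
        else:
            new_children.append(child)
    el[:] = new_children


# Illustrative input (assumed, not taken from the standard).
schema = ET.fromstring(
    "<grammar>"
    '<div><start><ref name="a"/></start>'
    '<div><define name="a"><text/></define></div></div>'
    '<define name="b"><empty/></define>'
    "</grammar>"
)
flatten_divs(schema)
```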
7.13 Number of child elements
A define, oneOrMore, zeroOrMore, optional, list or mixed element is transformed so that it has exactly one child
element. If it has more than one child element, then its child elements are wrapped in a group element.
An element element is transformed so that it has exactly two child elements, the first being a name class and the
second being a pattern. If it has more than two child elements, then the child elements other than the first are
wrapped in a group element.
An except element is transformed so that it has exactly one child element. If it has more than one child element,
then its child elements are wrapped in a choice element.
If an attribute element has only one child element (a name class), then a text element is added.
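The rules stated so far in this subclause can be sketched informatively as follows. The sketch assumes xml.etree.ElementTree elements with the RELAX NG namespace omitted from tag names, and assumes that the rule of 7.9 has already been applied, so that the first child of each element or attribute element is its name class. The names SINGLE_CHILD, wrap and normalize_child_count are hypothetical. The rule for choice, group and interleave elements is not covered.

```python
import xml.etree.ElementTree as ET

SINGLE_CHILD = {"define", "oneOrMore", "zeroOrMore", "optional", "list", "mixed"}


def wrap(el, start, tag):
    """Wrap the children of el from index start onward in a new tag element."""
    wrapper = ET.Element(tag)
    wrapper.extend(el[start:])
    el[start:] = [wrapper]


def normalize_child_count(root):
    # Snapshot the elements first, because wrappers are inserted below.
    for el in list(root.iter()):
        if el.tag in SINGLE_CHILD and len(el) > 1:
            wrap(el, 0, "group")
        elif el.tag == "element" and len(el) > 2:
            wrap(el, 1, "group")  # keep the name class as the first child
        elif el.tag == "except" and len(el) > 1:
            wrap(el, 0, "choice")
        elif el.tag == "attribute" and len(el) == 1:
            el.append(ET.Element("text"))


# Illustrative input (assumed, not taken from the standard).
schema = ET.fromstring(
    '<define name="x">'
    '<element><name>p</name><text/><ref name="a"/><ref name="b"/></element>'
    "<attribute><name>id</name></attribute>"
    "</define>"
)
normalize_child_count(schema)
```

In this example the two children of the define element are wrapped in a group element, the three patterns following the name class of the element element are wrapped in a second group element, and a text element is added to the attribute element.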
A choice, group or interleave element is transformed so that it has exactly two child elements. If it has one child
element, then it is replaced by its child element. If it has more than two child elements, then the first two child
elements are combined into a new element with the same name as the parent element and with the firs
...
