RX - A Prescription for RDF in XML

Jan 17, 2005

Copyright (C) 2005 Josh Sled

jsled {at} asynchronous {dot} org, <http://asynchronous.org/jsled>

Abstract

This document seeks to define an XML representation for RDF graphs which has a low impedance mismatch between commonly created RDF graphs and the common XML data formats in use on the web. Specifically, this format differs from the Recommended RDF/XML syntax in the following ways:

  1. XML elements correspond to RDF Properties.
  2. parseType="Resource" is the default; elements generally relate blank nodes to other blank nodes.
  3. Resource typing is optional and not expected; striping is eliminated.
  4. Options for serialization are explicitly minimized to encourage consistency for consumers. Specifically: property values in attributes are explicitly disallowed.

Issues

Status

Goals

A goal of RX is to make the simple simple and some of the complex possible. Conventional graphs -- often hand-written during the initial stages of development and usage -- must be clear and easy to express, but the serialization must retain sufficient capability to express many common yet complex RDF graphs. RX specifically does not seek to allow the expression of all legal RDF graphs.

@@ fixme -- why not? what does that imply?

Summary

With noted exceptions as defined in Framing the Document, XML element nodes in the document represent RDF Property relations between their container subject and child object. Unless specifically named using the is:about attribute, all elements relate blank nodes in an RDF graph:

<eg:doc is:about="http://www.w3.org/2004/lambda/Sites/index.html">
  <eg:creator>
    <eg:name>Joe Lambda</eg:name>
    <eg:homepage is:about="http://www.w3.org/2004/lambda/Sites/index.html" />
  </eg:creator>
</eg:doc>

...

<http://www.w3.org/2004/lambda/Sites/index.html>
    eg:creator [ eg:name "Joe Lambda"
               ; eg:homepage <http://www.w3.org/2004/lambda/Sites/index.html> ] .

Whether the object of a statement involving a particular XML element / RDF Property is a Literal or a Resource is decided via the following rule: if an element contains further XML elements, then the object is a probably-blank Resource, which is the subject of the statements represented by the contained XML elements. If the content of an XML element is simply character data (CDATA), then the object of the statement is a Literal with that character data as its value.

The attribute is:about is used to denote both subject- and object-resources. Simply put, the URI given by is:about always names the object-resource of the statement its element represents; if the element carrying the is:about attribute also contains children, then that URI is also the subject of the statements given by the contained element-properties.
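
The rule above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name object_kind and the eg namespace URI are made up for the example, and treating an empty element without is:about as a blank node is an assumption drawn from the abstract's "parseType="Resource" is the default" and from Example 1 below.

```python
import xml.etree.ElementTree as ET

IS_NS = "http://asynchronous.org/rx/ns/2005/01/is#"
ABOUT = "{%s}about" % IS_NS


def object_kind(element):
    """Classify the object of the statement an RX property element represents.

    Returns a (kind, value) pair where kind is "uri", "blank" or "literal".
    """
    about = element.get(ABOUT)
    if about is not None:
        # is:about always names the object-resource.
        return ("uri", about)
    text = (element.text or "").strip()
    if len(element) > 0 or not text:
        # Child elements present: the object is an unnamed (blank) resource.
        # Assumption: an entirely empty element is a blank resource as well.
        return ("blank", None)
    # Only character data: the object is a literal with that text as its value.
    return ("literal", element.text)


# The eg namespace URI is hypothetical; the source does not declare it.
SAMPLE = """
<eg:doc xmlns:eg="http://example.org/eg/"
        xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
        is:about="http://www.w3.org/2004/lambda/Sites/index.html">
  <eg:creator>
    <eg:name>Joe Lambda</eg:name>
    <eg:homepage is:about="http://www.w3.org/2004/lambda/Sites/index.html" />
  </eg:creator>
</eg:doc>
"""

creator = ET.fromstring(SAMPLE)[0]
for prop in (creator, creator[0], creator[1]):
    print(prop.tag, object_kind(prop))
```

Here eg:creator is classified as a blank resource, eg:name as a literal, and eg:homepage as a named resource, matching the graph shown in the Summary.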

Introduction via examples contrasted with RDF/XML

Because RDF/XML is widely used and familiar, it seems easiest to introduce RX by way of canonical RDF/XML examples.

RDF/XML Example 1:

<rdf:Description>
  <ex:editor>
    <rdf:Description>
      <ex:homePage>
        <rdf:Description>
        </rdf:Description>
      </ex:homePage>
    </rdf:Description>
  </ex:editor>
</rdf:Description>

becomes (example1.rx -> example1.nt):

<is:aDescription>
  <ex:editor>
    <ex:homePage />
  </ex:editor>
</is:aDescription>

RDF/XML Example 2:

<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <rdf:Description>
      <ex:homePage>
        <rdf:Description rdf:about="http://purl.org/net/dajobe/">
        </rdf:Description>
      </ex:homePage>
    </rdf:Description>
  </ex:editor>
</rdf:Description>

becomes (example2.rx -> example2.nt):

<is:aDescription is:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <ex:homePage is:about="http://purl.org/net/dajobe/" />
  </ex:editor>
</is:aDescription>

Multiple Properties as Elements

RDF/XML Example 4:

<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <rdf:Description>
      <ex:homePage>
        <rdf:Description rdf:about="http://purl.org/net/dajobe/">
        </rdf:Description>
      </ex:homePage>
      <ex:fullName>Dave Beckett</ex:fullName>
    </rdf:Description>
  </ex:editor>
  <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
</rdf:Description>

becomes (example4.rx -> example4.nt):

<is:aDescription is:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <ex:homePage is:about="http://purl.org/net/dajobe/" />
    <ex:fullName>Dave Beckett</ex:fullName>
  </ex:editor>
  <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
</is:aDescription>

Framing the Document

We distinguish two types of users of the RX format: "pure" and "application".

For "pure" users, the root element will be the is:stuff element, which will then contain any number of is:aDescription elements, each representing some top-level piece of data for the contained graph.

Example (pure-framing.rx -> pure-framing.nt):

<?xml version="1.0" encoding="utf-8"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          >
    <is:aDescription is:about="http://www.w3.org/TR/rdf-syntax-grammar"
                     is:a="http://example.org/stuff/1.0/doc">
      <ex:editor>
        <ex:homePage is:about="http://purl.org/net/dajobe/" />
        <ex:fullName>Dave Beckett</ex:fullName>
      </ex:editor>
      <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
    </is:aDescription>
</is:stuff>

For "application" users, the document-root element is an application-specific element, and will be expressed in RDF as the type of an outermost node. Yes, this is in stark contrast to every other use of XML elements here, but it's the most reasonable one.

The XML elements within this root element will be RDF property elements, as expected. The subject of these relations will be either a blank node or the subject-resource given by an is:about attribute on the root element, as normal.

Example (app-framing.rx -> app-framing.nt):

<?xml version="1.0" encoding="utf-8"?>
<ex:doc xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
        xmlns:ex="http://example.org/stuff/1.0/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        is:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <ex:homePage is:about="http://purl.org/net/dajobe/" />
    <ex:fullName>Dave Beckett</ex:fullName>
  </ex:editor>
  <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
</ex:doc>

Applications are encouraged to use 'application'-style framing.
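
As a rough sketch of how a consumer might tell the two framings apart, the Python below finds the top-level nodes of a document together with their subject and type. The helper names are hypothetical, and forming the type URI by concatenating the root element's namespace name and local name is an assumption; it is consistent with the pure-framing example, where is:a="http://example.org/stuff/1.0/doc" plays the role that the ex:doc root plays in the application-framing example.

```python
import xml.etree.ElementTree as ET

IS_NS = "http://asynchronous.org/rx/ns/2005/01/is#"
STUFF = "{%s}stuff" % IS_NS
A_DESCRIPTION = "{%s}aDescription" % IS_NS
ABOUT = "{%s}about" % IS_NS
A = "{%s}a" % IS_NS


def qname_to_uri(tag):
    """Turn ElementTree's '{namespace}local' tag into namespace + local."""
    if not tag.startswith("{"):
        return tag  # element without a namespace; left as-is in this sketch
    namespace, _, local = tag[1:].partition("}")
    return namespace + local


def top_level_nodes(root):
    """Return a list of (subject URI or None, type URI or None) pairs."""
    if root.tag == STUFF:
        # "Pure" framing: every is:aDescription child is a top-level node.
        return [(d.get(ABOUT), d.get(A)) for d in root if d.tag == A_DESCRIPTION]
    # "Application" framing: the root element itself gives the node's type.
    return [(root.get(ABOUT), qname_to_uri(root.tag))]


APP = """
<ex:doc xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
        xmlns:ex="http://example.org/stuff/1.0/"
        is:about="http://www.w3.org/TR/rdf-syntax-grammar" />
"""
PURE = """
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#">
  <is:aDescription is:about="http://www.w3.org/TR/rdf-syntax-grammar"
                   is:a="http://example.org/stuff/1.0/doc" />
</is:stuff>
"""

print(top_level_nodes(ET.fromstring(APP)))
print(top_level_nodes(ET.fromstring(PURE)))
```

Under these assumptions both documents describe the same typed, named node; only the framing differs.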

Document typing

The namespace for RX is:

http://asynchronous.org/rx/ns/2005/01/is#

The expected namespace prefix is "is", though, as per the Namespaces specification, there is no guarantee or requirement that this be so; of course, no matter what namespace prefix is (or is not) used, it is always pronounced "is".

The expected file extension for RX files is .rx.

@@ fixme media and content types.

Specifying named subjects

is:about="...uri..." is used to specify the subject of statements in the document fragment.

@@ fixme -- expand, constraints, examples

Languages using xml:lang

@@fixme add -- this should be identical to how RDF/XML handles xml:lang.

Data-typed Literals

To specify the datatype by which to interpret a literal, the is:ofDatatype="...datatype uri..." attribute is used.

Example (datatypes.rx -> datatypes.nt):

<?xml version="1.0" encoding="utf-8"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          >
  <is:aDescription is:a="http://example.org/stuff/1.0/measurement">
    <dc:date is:ofDatatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-01-16T13:38:00-05:00</dc:date>
    <ex:height is:ofDatatype="http://www.w3.org/2001/XMLSchema#float">1.759</ex:height>
    <ex:weight is:ofDatatype="http://www.w3.org/2001/XMLSchema#float">78.42</ex:weight>
  </is:aDescription>
</is:stuff>

Though it is not strictly required, the general desire of this -- and most -- formats is to present sufficient information for the parsing engine to convert a sequence of bytes into the most reasonable domain objects. As such, users are strongly encouraged to represent as much data typing as is feasible.
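
To make the effect of is:ofDatatype concrete, here is a small Python sketch that renders an element's content as an N-Triples typed literal. The function name nt_literal is invented for the example, and the escaping covers only backslash, double quote, newline, carriage return and tab; non-ASCII escaping is left out.

```python
import xml.etree.ElementTree as ET

IS_NS = "http://asynchronous.org/rx/ns/2005/01/is#"
OF_DATATYPE = "{%s}ofDatatype" % IS_NS


def nt_literal(value, datatype=None):
    """Render a literal in N-Triples syntax, typed if a datatype URI is given."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    literal = '"%s"' % escaped
    if datatype:
        literal += "^^<%s>" % datatype
    return literal


height = ET.fromstring(
    '<ex:height xmlns:ex="http://example.org/stuff/1.0/" '
    'xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#" '
    'is:ofDatatype="http://www.w3.org/2001/XMLSchema#float">1.759</ex:height>'
)
print(nt_literal(height.text, height.get(OF_DATATYPE)))
# prints "1.759"^^<http://www.w3.org/2001/XMLSchema#float>
```

An element without is:ofDatatype would simply yield a plain literal, since the attribute lookup returns None.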

@@question -- short forms for the XSD datatypes?

Literal XML content

In many cases we may wish to allow the object of a property to contain XML markup which is not intended to be interpreted by the rules contained herein. A boolean attribute, is:literalXml="true", may be specified to indicate to the interpreter that the contained content is to be treated as literal XML content. is:literalXml="false" could be specified, but it is silly; don't do that. The literal content will not be whitespace-stripped.

Example (literal-xml.rx -> literal-xml.nt):

<?xml version="1.0" encoding="utf-8"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:ex="http://example.org/stuff/1.0/">
  <is:aDescription>
    <ex:recipe>
      <ex:tastingNotes>
        <dc:date is:ofDatatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-01-16T13:42:00-05:00</dc:date>
        <dc:author>John</dc:author>
        <ex:body is:literalXml="true">
          <p>this is contained, <em>literal</em> XML data.</p>
        </ex:body>
      </ex:tastingNotes>
    </ex:recipe>
  </is:aDescription>
</is:stuff>

Mixed content

Mixed content is explicitly not supported in this scheme. Parsers encountering mixed content MUST report an error.
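
One way a parser might detect mixed content is sketched below in Python. The error class MixedContentError and the function name are hypothetical, since the text only requires that an error be reported. Skipping the inside of elements marked is:literalXml="true" is an assumption, made because the literal XML example above itself contains mixed content.

```python
import xml.etree.ElementTree as ET

IS_NS = "http://asynchronous.org/rx/ns/2005/01/is#"
LITERAL_XML = "{%s}literalXml" % IS_NS


class MixedContentError(ValueError):
    """Hypothetical error type for reporting mixed content."""


def check_no_mixed_content(element):
    """Raise MixedContentError if text and child elements are interleaved."""
    if element.get(LITERAL_XML) == "true":
        return  # assumption: literal XML content is opaque to this check
    if len(element) > 0:
        # Text before the first child, plus text following each child.
        texts = [element.text] + [child.tail for child in element]
        if any(t and t.strip() for t in texts):
            raise MixedContentError("mixed content in <%s>" % element.tag)
    for child in element:
        check_no_mixed_content(child)


ok = ET.fromstring("<a>\n  <b>text only</b>\n</a>")
bad = ET.fromstring("<a>stray text <b>child</b></a>")

check_no_mixed_content(ok)  # passes silently
try:
    check_no_mixed_content(bad)
except MixedContentError as error:
    print("rejected:", error)
```

Whitespace-only text between child elements is treated as indentation rather than content in this sketch.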

Naming nodes for intra-document reference

The is:about attribute is also used to name and reference in-document nodes, so long as the first character of the is:about content is the character '#':

is:about="#fragmentId"

An element carrying is:about="#fragmentId" identifies the object of its statement as the in-document node named by that fragment identifier.

RDF/XML Example 11:

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar"
                   dc:title="RDF/XML Syntax Specification (Revised)">
    <ex:editor rdf:nodeID="abc"/>
  </rdf:Description>

  <rdf:Description rdf:nodeID="abc"
                   ex:fullName="Dave Beckett">
    <ex:homePage rdf:resource="http://purl.org/net/dajobe/"/>
  </rdf:Description>
</rdf:RDF>

becomes (example11.rx -> example11.nt):

<?xml version="1.0"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:ex="http://example.org/stuff/1.0/">
  <is:aDescription is:about="http://www.w3.org/TR/rdf-syntax-grammar">
    <dc:title>RDF/XML Syntax Specification (Revised)</dc:title>
    <ex:editor is:about="#editor" />
  </is:aDescription>
  
  <is:aDescription is:about="#editor">
    <ex:fullName>Dave Beckett</ex:fullName>
    <ex:homePage is:about="http://purl.org/net/dajobe/"/>
  </is:aDescription>
</is:stuff>

Expressing the type of a node

We use the is:a attribute to define the rdf:type of the object of the statement of which the property is part.

Example (typed.rx -> typed.nt):

<?xml version="1.0" encoding="utf-8"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          xmlns:foaf="http://xmlns.com/foaf/0.1/"
          >
  <is:aDescription is:a="http://example.org/stuff/1.0/example">
    <dc:date is:ofDatatype="http://www.w3.org/2001/XMLSchema#dateTime">2005-01-16T13:38:00-05:00</dc:date>
    <dc:creator is:a="http://xmlns.com/foaf/0.1/Person">
      <foaf:nick>jsled</foaf:nick>
    </dc:creator>
  </is:aDescription>
</is:stuff>

Shortened names, &c.

Interpret xml:base appropriately.

Collections

We primarily use XML's hierarchy to represent the "right-hand-side" of RDF triples, but traditional XML usages have also used hierarchy and element-repetition to represent collections of data. The RX format disambiguates these cases with the is:aListOf attribute. The value of the attribute is the URI (namespace name plus local name) of the subordinate elements which should be interpreted as a list, rather than as repeated property-plus-object triples around the common subject:

<ex:container is:aListOf="http://example.org/stuff/1.0/items">
  <ex:items>list element 1</ex:items>
  <ex:items>list element 2</ex:items>
</ex:container>

Since the format makes explicit the contained elements which comprise the list, hierarchy can also be used to associate data with the list itself.

Example (list.rx -> list.nt):

<?xml version="1.0" encoding="utf-8"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:foaf="http://xmlns.com/foaf/0.1/"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          >
  <is:aDescription is:about="http://asynchronous.org/tmp/2005/01/16/rx/examples/list">
    <ex:container is:aListOf="http://example.org/stuff/1.0/items">
      <dc:title>My favorite things</dc:title>
      <dc:author is:a="http://xmlns.com/foaf/0.1/Person">
        <foaf:nick>jsled</foaf:nick>
      </dc:author>
      <ex:items is:ofDatatype="http://www.w3.org/2001/XMLSchema#int">42</ex:items>
      <ex:items>Roses on rooftops.</ex:items>
      <ex:items>Rainbows on kittens.</ex:items>
      <ex:items is:about="http://asynchronous.org/blog" />
    </ex:container>
  </is:aDescription>
</is:stuff>

Lists result in triple generation as per RDF Collections.
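
For readers unfamiliar with the RDF Collections expansion, the following Python sketch shows the rdf:first / rdf:rest / rdf:nil chain that a list of items turns into. The function name and the blank node labels are invented for illustration, the items are assumed to be already rendered as N-Triples terms, and how the head of the list is attached to the containing subject is not shown.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = "<%sfirst>" % RDF
RDF_REST = "<%srest>" % RDF
RDF_NIL = "<%snil>" % RDF


def collection_triples(items, prefix="_:list"):
    """Expand a list of N-Triples terms into an RDF Collection.

    Returns (head, triples); head is rdf:nil for an empty list.
    """
    if not items:
        return RDF_NIL, []
    nodes = ["%s%d" % (prefix, i) for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        # Each cell points at its item and at the next cell (or rdf:nil).
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, item))
        triples.append((nodes[i], RDF_REST, rest))
    return nodes[0], triples


head, triples = collection_triples(['"list element 1"', '"list element 2"'])
for triple in triples:
    print("%s %s %s ." % triple)
```

Each list cell is a blank node; the last cell's rdf:rest is rdf:nil, which closes the collection.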

Brief Reference Summary

is:stuff A root element for RX documents if the application does not wish to provide its own.
is:aDescription A container for subjects under is:stuff.
is:about Names resources; equivalent to rdf:about and rdf:resource.
is:a Specifies the rdf:type of the object.
is:ofDatatype Specifies the datatype for the interpretation of the Literal.
is:aListOf Specifies the child elements which should be treated as an RDF Collection.
is:literalXml Boolean; specifies whether the contained content is to be treated as literal XML rather than interpreted as RX.

Notes

References

More examples

An example using is:about at multiple levels in the graph (deeply-subjected.rx -> deeply-subjected.nt):

<?xml version="1.0" encoding="utf-8"?>
<ex:doc xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:foaf="http://xmlns.com/foaf/0.1/"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          is:about="http://asynchronous.org/rx/index.rst"
          >
  <dc:title>RX home page (reStructuredText source)</dc:title>
  <ex:convertsInto is:about="http://asynchronous.org/rx/index.html">
    <dc:title>RX home page</dc:title>
    <ex:synonomousWith is:about="http://asynchronous.org/rx/">
      <ex:isPreferred is:ofDatatype="http://www.w3.org/2001/XMLSchema#boolean">true</ex:isPreferred>
    </ex:synonomousWith>
  </ex:convertsInto>
  <dc:author is:a="http://xmlns.com/foaf/0.1/Person"
             is:about="http://asynchronous.org/jsled/foaf">
    <foaf:nick>jsled</foaf:nick>
    <foaf:knows is:about="http://waxy.org/">
      <foaf:nick>doink?</foaf:nick>
    </foaf:knows>
  </dc:author>
</ex:doc>