Rx -- A Prescription for RDF in XML

RX defines an XML representation for RDF graphs with a low impedance mismatch against both commonly created RDF graphs and the XML data formats in common use on the web. Specifically, this format differs from the Recommended RDF/XML syntax in the following ways:

  1. XML elements correspond to RDF Properties.
  2. parseType="Resource" is the default; elements generally relate blank nodes to other blank nodes.
  3. Resource typing is optional and not expected; striping is eliminated.
  4. Options for serialization are explicitly minimized to encourage consistency for consumers. Specifically, property values in attributes are disallowed.
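
As a rough illustration of point 4, the following Python sketch flags attributes that look like property values. It is not part of RX or of rx2nt.py. The function name property_attributes and the sample documents are hypothetical, and it assumes that any attribute outside the is: namespace counts as a property value in an attribute.

```python
import xml.etree.ElementTree as ET

# Namespace of the RX control vocabulary, as used in the example document.
IS_NS = "http://asynchronous.org/rx/ns/2005/01/is#"


def property_attributes(xml_text):
    """Return (element tag, attribute name) pairs for attributes outside IS_NS.

    Assumption (not taken from the RX spec): every attribute that is not in
    the is: namespace is a property value carried in an attribute.
    """
    offenders = []
    for elem in ET.fromstring(xml_text).iter():
        for name in elem.attrib:
            # ElementTree spells qualified names as "{namespace}local".
            if not name.startswith("{" + IS_NS + "}"):
                offenders.append((elem.tag, name))
    return offenders


# Hypothetical sample: the title is carried by a child element.
GOOD = """<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <is:aDescription is:about="http://example.org/doc">
    <dc:title>My favorite things</dc:title>
  </is:aDescription>
</is:stuff>"""

# Hypothetical sample: the title is carried by an attribute instead.
BAD = """<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <is:aDescription is:about="http://example.org/doc" dc:title="My favorite things"/>
</is:stuff>"""

print(property_attributes(GOOD))
print(property_attributes(BAD))
```

The first call reports nothing, while the second reports the dc:title attribute on is:aDescription.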

The goal is an XML serialization for RDF that is more immediately intuitive for potential users approaching from an XML-only perspective. Given an implementation, it would also provide a viable option for structured-data formats that wish to be interpreted as both XML and RDF.

Example

This is an example of RX showing most of its syntactic features:

<?xml version="1.0" encoding="utf-8"?>
<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:foaf="http://xmlns.com/foaf/0.1/"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
          >
  <is:aDescription is:about="http://asynchronous.org/tmp/2005/01/16/rx/examples/list">
    <ex:container is:aListOf="http://example.org/stuff/1.0/items">
      <dc:title>My favorite things</dc:title>
      <dc:author is:a="http://xmlns.com/foaf/0.1/Person">
        <foaf:nick>jsled</foaf:nick>
      </dc:author>
      <ex:items is:ofDatatype="http://www.w3.org/2001/XMLSchema#int">42</ex:items>
      <ex:items>Roses on rooftops.</ex:items>
      <ex:items>Rainbows on kittens.</ex:items>
      <ex:items is:about="http://asynchronous.org/blog" />
    </ex:container>
  </is:aDescription>
</is:stuff>
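
To make "XML elements correspond to RDF Properties" concrete, here is a small Python sketch that lists the property URI for each element in a trimmed copy of the example. It is only an illustration, not the rx2nt.py algorithm. It assumes the usual RDF/XML convention that a property URI is the element's namespace name followed by its local name, and that elements in the is: namespace are control elements, not properties. The function name property_uris and the is:about URL are placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace of the RX control vocabulary, as used in the example document.
IS_NS = "http://asynchronous.org/rx/ns/2005/01/is#"

# Trimmed copy of the example; the is:about URL is a placeholder.
RX_DOC = """<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <is:aDescription is:about="http://example.org/list">
    <ex:container is:aListOf="http://example.org/stuff/1.0/items">
      <dc:title>My favorite things</dc:title>
      <dc:author is:a="http://xmlns.com/foaf/0.1/Person">
        <foaf:nick>jsled</foaf:nick>
      </dc:author>
      <ex:items>Roses on rooftops.</ex:items>
    </ex:container>
  </is:aDescription>
</is:stuff>"""


def property_uris(xml_text):
    """List namespace + local name for every element outside the is: namespace.

    Assumption: this concatenation is how an element name becomes a property
    URI, as in RDF/XML; the RX spec itself is not consulted here.
    """
    uris = []
    for elem in ET.fromstring(xml_text).iter():
        # ElementTree spells qualified names as "{namespace}local".
        if not elem.tag.startswith("{"):
            continue  # element without a namespace: no URI to build
        namespace, local = elem.tag[1:].split("}", 1)
        if namespace != IS_NS:
            uris.append(namespace + local)
    return uris


for uri in property_uris(RX_DOC):
    print(uri)
```

The URIs are printed in document order, so ex:container comes first and ex:items last.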

Spec

Resources

Why?

RDF Triples in XML and A retrospective on the development of the RDF/XML Revised Syntax are both illuminating descriptions of the various means of, and challenges with, encoding RDF in XML. A Brief History of RDF Serialization Formats is exactly what its title claims.

These documents also outline the other "unstriped" approaches to representing RDF in XML. None of these approaches appeal to me at present, and they would certainly not appeal to a hypothetical data-format designer who is familiar with XML and is contemplating taking on the extra constraints and capabilities associated with RDF.

RDF/XML has well-known problems. Specifically, striping can be a mental encumbrance, especially when data-typing is not always present or obvious. The various properties used for subject-, object- and intra-[blank-]node identification can be confusing.

TriX and RXR are too focused on triples, which makes them even more verbose and harder to write by hand than RDF/XML.

RPV, while novel and curious, also distracts the writer from the domain of the data representation.

BSWL is similar to RPV, in that it is a primarily-triples-focused syntax, with abbreviations for properties as XML element names. It also forces the user to switch between RDF and domain models.

N3-in-XML is similar to RX in some respects, but it is very focused on N3, forces labeling of bnodes, and uses whitespace in attribute values to separate various concepts.

XENT is simply too divorced from the XML Infoset to be usable by pure-XML processing tools or by XML-only developers.

RX is very close to the strawman [RDF in XML syntax] by TimBL, which does not appear to have been developed further.

RX is nearly identical to the strawman response Simplified Syntax for RDF by Sergey Melnik. In fact, its description of the algorithm is very close to the one implemented in rx2nt.py, though I had not seen this proposal until after implementation.

Both RX and Melnik's Simplified Syntax are close to Jon Borden's Simplified XML Syntax for RDF, which isn't well-specified.

The XHTML Metainformation Module is another take, aimed at representing relatively constrained data structures in XHTML2 documents, but it shares some features with RX.

Why Not?

RX offers a very limited incremental benefit over RDF/XML, especially in machine-to-machine scenarios. I do believe, however, that it does a good job of addressing the human-factors issues around development, debugging and adoption, which should not be ignored.

The list syntax is still pretty onerous, and will put people off; the prohibition against data in attributes will put people off.

RDF exchanges some constraints (as well as liberties) for benefits. The primary issue with RDF adoption is that people don't like (the perception of) being constrained; another XML serialization won't change that.

One of the most important properties of an XML-focused RDF-in-XML syntax is predictability, especially with respect to non-RDF consumers (e.g. XSLT) that index content structurally. Because of the fundamental graph-vs-tree model mismatch, this predictability is both hard to specify, and even harder to guarantee. I've stopped short of attempting to do so in the RX spec -- though that was my original intent.
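
The kind of structural indexing at stake can be sketched with a plain XML path lookup, here in Python in place of XSLT. This is only an illustration: the function name item_texts, the fixed path and the is:about URL are assumptions, and nothing in the RX spec promises that a producer will always nest the items this way.

```python
import xml.etree.ElementTree as ET

# Prefix map for the path expression; the prefixes mirror the example document.
NS = {
    "is": "http://asynchronous.org/rx/ns/2005/01/is#",
    "ex": "http://example.org/stuff/1.0/",
}

# Trimmed copy of the example; the is:about URL is a placeholder.
RX_DOC = """<is:stuff xmlns:is="http://asynchronous.org/rx/ns/2005/01/is#"
          xmlns:ex="http://example.org/stuff/1.0/">
  <is:aDescription is:about="http://example.org/list">
    <ex:container is:aListOf="http://example.org/stuff/1.0/items">
      <ex:items>Roses on rooftops.</ex:items>
      <ex:items>Rainbows on kittens.</ex:items>
    </ex:container>
  </is:aDescription>
</is:stuff>"""


def item_texts(xml_text):
    """Collect the text of ex:items elements found at one fixed tree path.

    This is what a non-RDF consumer does: it never builds a graph, it only
    trusts that the data sits at this position in the tree.
    """
    root = ET.fromstring(xml_text)
    items = root.findall("./is:aDescription/ex:container/ex:items", NS)
    return [item.text for item in items]


print(item_texts(RX_DOC))
```

A lookup like this keeps working only if every producer serializes the same graph with the same nesting, and that is the guarantee that is hard to specify.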

N3 and Turtle both offer compelling syntaxes for RDF which are extremely friendly to both reading and writing, as well as being easily machine parsable and having excellent representational capability.


authored: jsled, 2005.01.17