Ted Leung on the air: Open Source, Java, Python, and ...
- Seamless XML support. Never having to explicitly parse an XML document
- XPath as a native language construct
- Dynamic conversion between text and parsed representations of the XML.
- XPath manipulation for XML modifications
This topic is fresh on my mind, as I've just finished reading a series of papers on this very topic:
Martin Kempa and Volker Linnemann's PlanX '02 paper, On XML Objects, describes XOBE, a system implemented as a Java preprocessor, which aims to eliminate the distinction between the string representation and the object representation of XML documents. Their approach generates classes for each of the elements in an XML grammar (DTD or Schema) and allows for object literals that look syntactically like XML. XOBE also allows XPath expressions for querying the resulting object hierarchies.
Erik Meijer and Wolfram Schulte's OOPSLA 2003 submission: Unifying Tables, Objects, and Documents takes a different approach. Meijer and Schulte show how to extend C# (it could just as easily be Java) to deal with relational and XML data. They set forth a number of design principles for their experimental language, but two of the most important are:
- Denotable values should be (easily) expressible
- Expressible values should be denotable
C# is extended to support streams of various lengths (this can be done easily in languages which have generators, like Python). It is also extended to allow tuples (heterogeneous structures of optionally labelled variables of fixed length). Python has unlabelled tuples built in. The combination of streams and tuples is used to model relational data. The last supporting extension is union types.
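To make the stream-plus-tuple idea concrete, here is a rough Python sketch (not code from the paper). A generator plays the role of the stream and a namedtuple plays the role of a labelled tuple. The `Book` type, its fields, and the sample rows are all made up for illustration.

```python
from collections import namedtuple
from typing import Iterator

# Hypothetical labelled tuple standing in for one row of a relation.
Book = namedtuple("Book", ["title", "author", "year"])


def books() -> Iterator[Book]:
    """A stream of rows, produced lazily one at a time by a generator."""
    yield Book("Title A", "Author X", 1999)
    yield Book("Title B", "Author Y", 2003)
    yield Book("Title C", "Author X", 2001)


# Consuming the stream: pick the titles written by one (made-up) author.
titles = [b.title for b in books() if b.author == "Author X"]
print(titles)
```

The generator never materializes the whole relation, which is the property that makes a stream a reasonable model for rows coming from a database as well as from memory.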
Streams, tuples, and unions are then used to model a large part of the XML Schema (XSD) type system. This means that the programmer can declare classes that correspond to XSD types. C# is also extended with XML literals, which can be stored in instances of the appropriate class.
In the area of querying, Meijer and Schulte have departed from XPath syntax. Instead, the mechanisms for looping / querying are taken from functional programming: lifting, apply-to-all, and folds. This provides a nice mechanism that works for streams (relational data) and XML data. These mechanisms support accessing fields/members of objects to provide path expression / XPath-like navigation. The authors also provide wildcard, transitive, and type-based member access. Wildcard access returns all the members of a type (in declaration order); transitive access allows you to find a member that is transitively reachable from some other member. Type-based access allows you to restrict the type of the member being (transitively) searched for (the restriction is reminiscent of XPath axis notation).
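Here is a loose Python approximation of lifted member access, transitive access, and a fold. This is my own sketch, not the paper's syntax or semantics: `Node`, `lift`, and `descendants` are hypothetical helpers, and the library/shelf/book data is invented.

```python
from functools import reduce


class Node:
    """Hypothetical stand-in for an object with named members."""

    def __init__(self, **members):
        self.__dict__.update(members)


def lift(stream, name):
    """Member access lifted over a stream: yields item.name for each item."""
    for item in stream:
        yield getattr(item, name)


def descendants(obj, name):
    """Transitive access: yield every member called `name` reachable from obj."""
    for key, value in vars(obj).items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if key == name:
                yield v
            if isinstance(v, Node):
                yield from descendants(v, name)


library = Node(shelves=[
    Node(books=[Node(title="A", pages=100), Node(title="B", pages=250)]),
    Node(books=[Node(title="C", pages=50)]),
])

# Lifted access over one stream of books.
first_shelf_titles = list(lift(library.shelves[0].books, "title"))

# Transitive access: every title reachable from the library, at any depth.
all_titles = list(descendants(library, "title"))

# A fold over the stream of page counts.
total_pages = reduce(lambda acc, p: acc + p, descendants(library, "pages"), 0)
```

With this data, `all_titles` comes out in document order, which loosely mirrors what a descendant-style path expression would return over the equivalent XML.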
In addition to functional-style querying, this extension of C# can also support a SQL-like select-from-where clause. The cool thing that they point out is that it doesn't matter where the queried data is: it can be in memory or on disk. This is the beauty of the stream abstraction.
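A small Python sketch of why the stream abstraction makes the data's location irrelevant. It is an illustration under assumptions, not the paper's design: `select` and `big_names` are hypothetical helpers, the rows are invented, and an `io.StringIO` stands in for a file on disk so the example is self-contained.

```python
import csv
import io


def select(project, source, where=lambda row: True):
    """Lazily yield project(row) for each row of source that satisfies where."""
    for row in source:
        if where(row):
            yield project(row)


# The same rows, once as an in-memory list ...
in_memory = [{"name": "a", "size": "10"}, {"name": "b", "size": "99"}]

# ... and once as CSV text read row by row (StringIO stands in for an open file).
on_disk = csv.DictReader(io.StringIO("name,size\na,10\nb,99\n"))


def big_names(source):
    """One query, written once, that runs over any iterable of rows."""
    return list(select(lambda r: r["name"], source,
                       where=lambda r: int(r["size"]) > 50))


from_memory = big_names(in_memory)
from_disk = big_names(on_disk)
```

The query only asks for "something I can iterate over", so swapping the list for a file reader (or a database cursor) does not change the query code at all.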
The same authors (plus one), Erik Meijer, Wolfram Schulte, and Gavin Bierman, have a paper in XML 2003, Programming with Circles, Triangles and Rectangles, which focuses on the XML aspects of the language described in the previous paper. The experimental language is now being called Xen. This paper goes into a lot more detail on the mismatch between the XML data model and object data models. There's less on streams and tuples (and the type rules), a little more on lifting, and examples of handling the XQuery use cases in Xen.
Ned Batchelder points out that the version linked above is encoded in MSIE-only HTML. Bierman has made a friendlier version available (at least it renders in Mozilla).
It seems clear that everyone agrees that denotable values should be easily expressible and that expressible values should be denotable. What's not as clear is what the query model should be for a language that has XML support baked in. XPath is an obvious choice, but then you only get the functionality for XML data, and I'd like to have the same kind of query capability for any hierarchical data. That's what I like about the Xen approach. Of course, it is possible to make XPath work over objects, à la the ObjectXPathNavigator or JXPath...
I'm glad to see this work being published / discussed. It seems to me that Xen is most likely to make an appearance in commercial products, since Erik Meijer works in the WebData group at Microsoft, which is a product, not a research group. If you want to stay in the statically typed curly brace language world, it looks like Microsoft is kicking the tires in all the right places. If Xen becomes C#2006, then Java will be sucking wind (at least in my book).

To me it seems too ambitious -- I'd rather see some more robust querying mechanisms in Python that were domain-specific before trying to unify it all across disparate domains.
Posted by Ian Bicking at Fri Oct 31 16:52:11 2003
Posted by James Strachan at Mon Nov 3 02:56:30 2003
http://dev2dev.bea.com/products/wlworkshop/articles/JSchneider_XML.jsp
Posted by Mike Dierken at Wed Feb 4 21:30:16 2004
!
Posted by Burak Emir at Wed Nov 2 04:45:25 2005
