Forum OpenACS Development: XML to XoTCL Object
Check out "xotcl-soap". If you search for that or for "xosoap" on the forums, you'll find a decent bit to get you started. Also, here's an active thread where we're discussing installation and a bit of "which is the most suitable tool":
https://openacs.org/forums/message-view?message%5fid=1392533
Kind regards,
Justis
Is there already an api available that maps xml records to xotcl objects?
From my side (which is mostly xosoap's), there is on-going work to provide something that is close to what you are looking for. The key idea is to provide some support for bridging xml schema and schema instances to the XOTcl object system (and vice versa). The result will be a dwarfish brother to the plethora of "xml data binding" frameworks available, just to name a few (predominantly created for the Javaverse):
- Axiom: http://ws.apache.org/commons/axiom/
- Castor: http://castor.codehaus.org/
- XMLBeans: http://xmlbeans.apache.org/
However, this issue is certainly NOT TRIVIAL. For a fairly comprehensive (and practitioner-oriented) overview, see the article on Revealing the X/O impedance mismatch. A rough conclusion is that (a) no framework support can be fully generic and, therefore, (b) it needs to be built around specific use cases; remoting (xosoap) being one.
Anyway, as for the current status:
- My current code work is just an implementation study, I am still fiddling around with major details. I can hardly promise something operative in the very near future. I would say that I have not tackled more than 40% of the work needed overall, ...
- However, I am not re-inventing the wheel: I re-use the approach of xotcl-core's html procs, i.e. tdom based inline scripting, for "deferred building" of documents. Also, xotcl comes with powerful language features/idioms that align nicely with the design requirements. This is particularly true of xotcl's slots, or attribute slots to be specific.
- I am concentrating on providing a basic infrastructure to cover more specific details of x/o mappings when needed.
- Finally, the basic facilities should be linked to the existing xotcl-core features, such as the ACS Object infrastructure, db stubs, widgets etc.
Again, there is no concrete code artifact I could easily share, it is just a rough outlook. However, maybe there is something in Gustaf Neumann's arsenal that I am not aware of and that fits nicely in your scenario?!
My personal view of objects is that they are entirely data. Methods on objects are distinct from the objects, or at least, behavior is entirely determined by the data content of an object.
If this view is accepted, then any Tcl representation of the content of an XML element/document is valid. Since XML documents don't have methods, this seems particularly valid in this case.
The tWSDL/TWiST package has a very tiny module to translate an XML document into a nested Tcl namespace. There is a companion set of procedures to serialize the structure back into an XML document. The input is a tDOM toList of the XML document.
Check it out if you wish.
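For readers who have not seen the approach, here is a minimal sketch of the general idea: a tDOM-style nested list {tag attributes children} is stored in nested Tcl namespaces and serialized again. This is not the tWSDL/TWiST code. The proc names listToNamespace and namespaceToXml and the variable names tag, attr, text and parts are made up for illustration, text is not escaped, and the position of text inside mixed content is lost.

```tcl
# Hypothetical helper (not part of tWSDL): store one element of a
# tDOM-style list {tag attributes children} in the namespace $ns.
proc listToNamespace {ns element} {
    lassign $element tag attrs children
    namespace eval $ns {}
    set ${ns}::tag $tag
    array set ${ns}::attr $attrs
    set ${ns}::text ""
    set ${ns}::parts {}
    set n 0
    foreach child $children {
        if {[lindex $child 0] eq "#text"} {
            # text nodes are collected in a single variable
            append ${ns}::text [lindex $child 1]
        } else {
            # child elements become numbered child namespaces
            set childNs ${ns}::[incr n]
            listToNamespace $childNs $child
            lappend ${ns}::parts $childNs
        }
    }
    return $ns
}

# Hypothetical companion: serialize the namespace tree back to XML.
# No escaping of special characters is done in this sketch.
proc namespaceToXml {ns} {
    set xml "<[set ${ns}::tag]"
    foreach {name value} [array get ${ns}::attr] {
        append xml " $name=\"$value\""
    }
    append xml ">[set ${ns}::text]"
    foreach childNs [set ${ns}::parts] {
        append xml [namespaceToXml $childNs]
    }
    append xml "</[set ${ns}::tag]>"
    return $xml
}

# The input has the shape produced by tDOM's asList
set tree {user {name ::539} {{email {} {{#text neumann@wu-wien.ac.at}}} {url {} {}}}}
listToNamespace ::doc $tree
puts [namespaceToXml ::doc]
```

After the call, plain Tcl commands such as [set ::doc::1::text] or [info vars ::doc::*] are enough to look at the content.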
The X/O article is a very informative introduction to the issues involved. However, I think it is important to point out that OpenACS doesn't have to deal with the entire set of problems. There is a useful subset of issues, some outlined in the article, where mapping can be more easily achieved.
For the foreseeable future, tDOM will serve as the basis of working with XML documents in Tcl. However, I find it useful mostly for the initial parsing of the XML. Parsing is a char-by-char operation and can't be done on the cheap. Beyond this, I find the opaqueness of the dom objects to be a major constraint. In addition, all of the document structure is beyond the Tcl language, so it is hard to claim that any mapping of XML to Tcl has taken place in tDOM. Sticking with tDOM essentially means not solving the problem.
There are some limitations to tDOM.
Most notable is that it cannot validate using an XML-Schema. At first this might be considered acceptable, but it quickly leads to the second limitation:
You cannot construct types, or auto-generate API to create valid types based upon input data (handling missing data is difficult, and without a schema, impossible). Also, without a schema, every hand written API must include structural details (element/attribute names). This in turn makes it impossible to use abstract types which assume the name of the new type but reuse the substructure.
When working on tWSDL, the first problem I needed to solve was the ability to fully represent an XML document in Tcl, at least for a useful subset of XML (minus PI and DTD). This meant that each part of the XML document would be accessible from a Tcl script and that the representation could be serialized into an XML document/element. The X/O article calls this round-tripping.
But another requirement, called XML Fidelity was also critical. There is a long list of features here, mostly programmatic. How easy is it to construct and manipulate an internal representation of an XML document? The paper gives poor marks for the C# tool used as a representative mapper. Compare this to tWSDL:
Facet              C# O/T/C    tW O/T/C
flat structure     +  +  +     +  +  +
nested struct      +  ?  ?     +  +  +
mixed content      ?  ?  ?     +  +  +
whitespace         -  -  -     +  +  +
xml comments       -  -  -     +  +  +
processing inst    -  -  -     -  -  -
attribute order    -  -  -     -  -  -
namespace prefix   -  -  -     +  +  +
DTD                -  -  -     -  -  -
Round-tripping means, at minimum the ability to de-serialize an arbitrary XML document into an object structure and then serialize it in a form with identical meaning. The C# tool was unable to keep elements in the order in which they were found in the original document. tWSDL also does not reproduce the exact document, but only in three respects:
1. attribute order is not preserved, and it isn't required to be in any standard.
2. whitespace within elements 'without' complexContent is normalized and indented based upon structure.
3. no handler for comments (easy to fix).
The most immediately available set of complex XML documents are WSDL documents. Since these can form the basis of automatic programming, being able to represent them and read them 'automatically' is important. The unfinished TWiST client has a simple procedure for downloading and loading a WSDL into a Tcl rep:
proc ::wsclient::getWSDL { service url } {

    if {[string match "https://*" "$url"]} {
        set result [ns_httpsget $url]
    } else {
        set result [ns_httpget $url]
    }

    # to dom doc
    dom parse $result wsdlDoc
    $wsdlDoc documentElement wsdlRoot

    set wsdlNS ::wsclient::${service}::wsdl
    set ::wsclient::${service}::definitions $result

    ::xml::instance::newXMLNS $wsdlNS [$wsdlRoot asList] "1"

    rename $wsdlDoc ""

    set ::wsclient::${service}::wsdlURL $url

    return $wsdlNS
}
I have a testing tool which can download any WSDL, parse and display the content of the Tcl structure: http://junom.com/document/openacs/wsclient/. It also pulls out some data. The input needs to be a WSDL as some procs will error out looking for document components. But notice that the getWSDL proc only uses tDOM to parse and create a list. After that, regular Tcl commands can be used to pull information out of the document.
However, I think it is important to point out that OpenACS doesn't have to deal with the entire set of problems
I would not make such a claim about the needs of the framework in general. However, I agree that in many cases a much simpler approach is sufficient. An important aspect is to know in which case which subset is sufficient.
... You cannot construct types, or auto-generate API to create valid types based upon input data (handling missing data is difficult, and without a schema, impossible). ...
Again, I differ. The original thread was about XOTcl objects, which support attribute slots. Attribute slots are objects containing meta-data and methods for instance attributes, such as for example defaults, labels, datatypes, multiplicity, etc. In case of the classes derived from acs-object-types, the attribute slots contain the datatypes from acs-attributes. Therefore, with a one-line change, one gets from my script posted above for the acs-object-type "user" (from which the class ::xo::db::user can be automatically derived, see https://openacs.org/forums/message-view?message_id=1165841) the following type definition for XML:
<xs:complexType name="xo_.._db_.._user">
  <xs:sequence>
    <xs:element name="object_id" type="xs:integer"/>
    <xs:element name="object_title" type="xs:text"/>
    <xs:element name="email" type="xs:string"/>
    <xs:element name="party_id" type="xs:integer"/>
    <xs:element name="url" type="xs:string"/>
    <xs:element name="first_names" type="xs:string"/>
    <xs:element name="last_name" type="xs:string"/>
    <xs:element name="person_id" type="xs:integer"/>
    <xs:element name="user_id" type="xs:integer"/>
  </xs:sequence>
  <xs:attribute name="name" type="xs:string"/>
</xs:complexType>
... and for the user_id 539 the following xml instance:
<xo_.._db_.._user name="::539">
  <object_id>539</object_id>
  <object_title>Gustaf Neumann</object_title>
  <email>neumann@wu-wien.ac.at</email>
  <party_id>539</party_id>
  <url></url>
  <first_names>Gustaf</first_names>
  <last_name>Neumann</last_name>
  <person_id>539</person_id>
  <user_id>539</user_id>
</xo_.._db_.._user>
This approach certainly provides no general mapping between OO and XML, but for the domain of mapping/round-tripping acs-objects via XOTcl objects to XML and vice versa it is a reasonable step, especially since the oo-mapping of xotcl-core provides a unified interface for storing/retrieving the data. Again, this is just a first step; one might wish, for example, to eliminate the ID columns from the XML schema and/or instance.
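The idea of deriving the type definition from attribute metadata can be sketched without XOTcl. In the following illustration a plain dict of attribute names and XML Schema types stands in for the attribute slots. The proc name complexTypeFor is hypothetical, and this is not the script that produced the output above.

```tcl
# Hypothetical stand-in for slot metadata: a dict mapping
# attribute names to XML Schema datatypes.
proc complexTypeFor {typeName slots} {
    set lines [list "<xs:complexType name=\"$typeName\">" "  <xs:sequence>"]
    # one xs:element per attribute, in the order given
    dict for {name type} $slots {
        lappend lines "    <xs:element name=\"$name\" type=\"$type\"/>"
    }
    lappend lines "  </xs:sequence>" \
        "  <xs:attribute name=\"name\" type=\"xs:string\"/>" \
        "</xs:complexType>"
    return [join $lines \n]
}

set userType [complexTypeFor xo_.._db_.._user {
    object_id xs:integer
    email     xs:string
    user_id   xs:integer
}]
puts $userType
```

With real attribute slots, the dict would be filled by iterating over the slots of the class and reading the name and datatype from each slot object.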
I fail to see how the unfinished TWiST client example helps with the original question of Nima, beyond the (trivial) observation that tDOM already contains an interface to round-trip XML structures into Tcl structures and vice versa ("domNode asList" and "appendFromList"). As mentioned above, for most applications I find it more convenient to work on an XML document via the DOM interface than via the nested list structures generated from asList.
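For reference, the round trip via "asList" and "appendFromList" can be sketched as follows. The sample document and the wrapper element name are assumptions made for the example; the tDOM package must be installed.

```tcl
package require tdom

set xml {<user name="::539"><email>neumann@wu-wien.ac.at</email><url/></user>}

# Parse the text and convert the tree into a nested Tcl list.
# The list has the shape: tag attributeList childList
set doc [dom parse $xml]
set root [$doc documentElement]
set asList [$root asList]

# Rebuild the element in a second document from the list.
# "wrapper" is just a container element for this example.
set doc2 [dom createDocument wrapper]
set root2 [$doc2 documentElement]
$root2 appendFromList $asList
set copy [[$root2 firstChild] asXML -indent none]

puts $asList
puts $copy

$doc delete
$doc2 delete
```

The list in asList is an ordinary Tcl value, so lindex, foreach and friends work on it between the two steps.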
Huh? The tDOM 'asList' is an input:
::xml::instance::newXMLNS $wsdlNS [$wsdlRoot asList] "1"
The above proc creates a nested namespace representation of the XML document. Then the document can be validated against an XML Schema and/or data extracted without using xpath/xquery.
And serialization to XML is not handled by tDOM. That means I can create XML documents without tDOM, serialize them without tDOM, validate them without tDOM. With tDOM, you can't introspect the document, and the global namespace gets polluted with doc and node commands, and did I mention validation? tDOM does not validate using an XML-Schema.
For the simple example which you gave, this proc works well for extracting the child values into an array:
proc ::xml::childElementsAsArray { namespace arrayName } {

    set ChildElements [set ${namespace}::.PARTS]
    upvar $arrayName ChildArray

    foreach ChildElement $ChildElements {
        foreach {ChildType prefix ChildPart} $ChildElement {}
        set ChildPart [::xml::normalizeNamespace $namespace $ChildPart]
        lappend ChildArray($ChildType) [::xml::instance::getTextValue $ChildPart]
    }
}
As far as then saving this 'object', I'll walk through all the steps:
::xml::childElementsAsArray ::obj::ns userArray
qw_new users userArray
Maybe I didn't answer Nima's question. Why do you care? Everyone else talked about how difficult it would be to answer 'yes' at some point in the future. Is it wrong for me to mention that a mapping can be done? Has been done? And works? Sorry it isn't to XOTcl objects. Since the code I'm talking about is finished and stable and has nothing to do with the progress on the TWiST client, who cares about this? That is the point, nobody has to wait around for me to do anything. The reason I used a WSDL file as an example is because of the high level of complexity in representing, reading and creating them. The tiny XML package I wrote makes the job fairly easy, no more difficult than using tDOM commands, in some ways much easier as certain details are handled automatically.
Then the document can be validated against an XML Schema and/or data extracted without using xpath/xquery
Do you say that you have implemented an XML Schema validator? That sounds like a useful contribution! Can you explain more?
.... And serialization to XML is not handled by tDOM ...
What is wrong about using appendFromList?
With tDOM, you can't introspect the document
What can be done with the namespace tree you are producing that can't be done via DOM (with a well known interface)?
For getting the child elements, [$root childNodes] seems simpler to me.
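A minimal sketch of the DOM variant, collecting the child values into an array keyed by element name. The sample document and the array name userArray are assumptions for the example; the tDOM package must be installed.

```tcl
package require tdom

set doc [dom parse {<user><email>neumann@wu-wien.ac.at</email><user_id>539</user_id></user>}]
set root [$doc documentElement]

# One array entry per child element name; repeated names
# accumulate their values in a list.
foreach node [$root childNodes] {
    lappend userArray([$node nodeName]) [$node text]
}

parray userArray
$doc delete
```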
Do you say that you have implemented an XML Schema validator? That sounds like a useful contribution! Can you explain more?
I'm not sure of the terminology. This code is all part of a type system which is relatively independent from the tWSDL/TWiST code. The type code can create types using the methods of restriction outlined in XML-Schema datatypes:
http://www.w3.org/TR/xmlschema-2/
and these can be combined into complexTypes, mostly limited to sequences:
http://www.w3.org/TR/xmlschema-1/
Once the types are defined, then instances can be created with the ::new proc (which supports full nesting to any depth).
The defined schema is part of the WSDL file, but could probably be pulled out as an independent document. Documents/instances can be validated using the ::validate proc. The validation does structural validation then type validation. Validation stops on the first error and backs out, marking a path (using internal variables). Right now the only use for this validation information is to create a SOAP fault indicating the reason for failure, but maybe you could go in and correct something and revalidate.
The best way to understand all of this is to just look at the auto-generated code for handling all of this:
http://junom.com/ws/mywebservice/
The entire auto-generated code is based upon the config:
http://junom.com/ws/mywebservice/index.txt
The auto-generated WSDL/schema:
http://junom.com/ws/mywebservice/?WSDL
The derivation of decimal types is complex to code, but very useful. The example operation is at:
http://junom.com/ws/mywebservice/?op=TestDecimalValueOperation&mode=display
The generated type validation procedure is here:
http://junom.com/ws/mywebservice/?ns=::wsdb::types::mywebservice::TestDecimal
In general, you can browse under either simpleTypes or complexTypes to see the various code for either type creation or type validation.
What is wrong about using appendFromList?
There is nothing wrong with it, however it would mean several inconveniences. The first is it would require tDOM. Second, during building the document, additional information is added which aids validation (see above examples). Also, invalid documents are marked with additional metadata which isn't part of the XML document. Another problem is that you can't easily introspect the document to figure out why you can't get something to work. I use a namespace browser to inspect documents and all code generated by the service description. Tracking down data errors in tDOM would be difficult (for me at least). Also, some documents are constructed in parts. This means they need to be created before their parents, even before the document. This is easy, since the creating proc only needs to return the location (namespace) of the new element. This name can be used as a reference to add the element.
So documents/elements are sometimes looked over more than once. One additional example is the problem of serialization. I'm using only document/literal, so there is minimum need to worry about it, but rpc/(encoded/literal) are only slightly different from document/literal. In many cases, only a type attribute needs to be added to elements. Since the SOAP layer is above the document creation layer, this decision cannot be made until after the document is filled with data. Some future RPC layer could use the schema and go back and add the type information, possibly adding prefixes, etc.
There are example Tcl reps of XML documents under the ::xml::instance namespace:
http://junom.com/ws/mywebservice/?ns=::xml::instance
Finally, there is nothing wrong with tDOM, I would never suggest that, but you can't get any better introspection than simply browsing the document. This type of introspection is aimed at finding errors and understanding what is going on. Point and click is easier on my brain.
me: Do you say that you have implemented an XML Schema validator?
Tom: I'm not sure of the terminology.
An XML Schema validator is a program that accepts an XML Schema (as defined by w3c) and a schema instance (an XML document), analyzes both and determines if the XML document is a valid instance of the schema (well-formed by XML rules and obeying the structural and datatype constraints of the XML Schema).
If I understand correctly, your implementation parses the XML-schema (from the WSDL document) into the namespace-structure and auto-generates the type-checkers for the primitive and derived XML datatypes (terminology of http://www.w3.org/TR/xmlschema-2/).
Does your implementation cover the full set of built-in datatypes of XML Schema, the full set of built-in derived datatypes, and the full set of simple type definitions (section 4)? I understand that you have not implemented complex type checking (yet).
This is certainly an interesting addition to a tcl based xml processing environment, complementing tdom.
Concerning introspection: The term introspection is used in computer science for the ability of a program to query at runtime its own structure and behavior (and optionally to modify it; using the terms "read introspection" and "write introspection"). Another term for this is "reflection". This ability is particularly important for dynamic languages, where the structures (e.g. object-class relationships, class-class relationships, adding methods or variables dynamically, ...) might change during runtime. By this meaning of the term, tdom has read and write introspection; I would rather call your usage of the term xml structure browsing.
Gustaf,
From the user's perspective, the xml component of tWSDL does document validation based upon a defined XML Schema. However, there is more going on, and there is no single step which quite qualifies under this definition.
First, types are defined via a Tcl API and the XML Schema is generated based upon the definitions. If this were not the case, then all the xsd types would have to be hand coded, or would they not even exist until you parsed an xsd which defined them? The link below shows how the xsd types are derived:
http://junom.com/gitweb/gitweb.perl?p=twsdl.git;a=blob;f=packages/wsdl/ns/ns-xsd.tcl
The page needs some cleanup, but basically it outlines how the internal API generates the type system. I'll include a few lines here:
# Create xsd schema
::wsdl::schema::new xsd "http://www.w3.org/2001/XMLSchema"

# anySimpleType
::wsdl::types::primitiveType::new xsd anySimpleType {return 1}

# string
::wsdl::types::primitiveType::new xsd string {return 1}

# dateTime
::wsdl::types::primitiveType::new xsd dateTime "return \[::wsdb::types::tcl::dateTime::toArray \$value\]"

# duration
::wsdl::types::primitiveType::new xsd duration "return \[::wsdb::types::tcl::dateTime::durationToArray \$value\]"

# boolean
::wsdl::types::simpleType::restrictByEnumeration xsd boolean xsd::string {0 1 true false}

# Decimal Type
::wsdl::types::simpleType::restrictDecimal xsd decimal xsd::string {pattern {\A(?:([\-+]?)([0-9]*)(?:([\.]?)|([\.])([0-9]+))){1}\Z}}
::wsdl::types::simpleType::restrictDecimal xsd integer tcl::integer {fractionDigits 0}
::wsdl::types::simpleType::restrictDecimal xsd int tcl::integer {fractionDigits 0}
::wsdl::types::simpleType::restrictDecimal xsd nonPositiveInteger xsd::integer {maxInclusive 0}
::wsdl::types::simpleType::restrictDecimal xsd negativeInteger xsd::integer {maxInclusive -1}
::wsdl::types::simpleType::restrictDecimal xsd short xsd::integer {minInclusive -32767 maxInclusive 32767}
::wsdl::types::simpleType::restrictDecimal xsd byte xsd::integer {minInclusive -127 maxInclusive 127}
Structural types are supported as sequences. You can specify minOccurs, maxOccurs, type, default value, nillable, and implicitly, the order of child elements. Children can be either simpleType content or another complexType. You can also have a child element with a local name which refers to a global type. The code for creating and validating the type is in the global type, but the local element provides the name and reference to the global type.
The structural details are checked first during validation. If all children are present in the correct number, validation steps through the child elements until eventually children with only simpleType content are validated.
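The two-pass idea (structure first, then simple types, stopping at the first failure) can be illustrated with a small standalone sketch. It is not the tWSDL validator: the proc names, the spec format {name minOccurs maxOccurs checker} and the return value {ok element message} are all made up for the example, and the order of the children is not checked.

```tcl
# Hypothetical simple-type checkers: return 1 for valid values
proc isInteger {value} { return [string is integer -strict $value] }
proc isString  {value} { return 1 }

# spec:     list of {name minOccurs maxOccurs checkerProc}
# instance: list of {name value} pairs in document order
# returns:  {1 {} {}} or {0 failingElement message}
proc validateSequence {spec instance} {
    # Structural pass: every declared child must occur min..max times.
    foreach child $spec {
        lassign $child name min max typeProc
        set count 0
        foreach pair $instance {
            if {[lindex $pair 0] eq $name} { incr count }
        }
        if {$count < $min || $count > $max} {
            return [list 0 $name "expected $min..$max occurrences, found $count"]
        }
        set checker($name) $typeProc
    }
    # Type pass: stop at the first value that fails its checker.
    foreach pair $instance {
        lassign $pair name value
        if {![info exists checker($name)]} {
            return [list 0 $name "undeclared element"]
        }
        if {![$checker($name) $value]} {
            return [list 0 $name "invalid value '$value'"]
        }
    }
    return [list 1 {} {}]
}

set spec {{user_id 1 1 isInteger} {email 0 1 isString}}
puts [validateSequence $spec {{user_id 539} {email neumann@wu-wien.ac.at}}]
puts [validateSequence $spec {{user_id abc}}]
puts [validateSequence $spec {{email neumann@wu-wien.ac.at}}]
```

For nested complexTypes, the checker of a child would itself call validateSequence and prepend the child name to the failing element, which yields a path to the error.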
If there is a validation error, the validation checker marks the 'nodes' on the way back out so a client can get a complete path to the error and a pretty good error message indicating what failed. In tWSDL, this is used to return a SOAP fault message (client fault). However, any application could easily access the same information. The error information is stored with the document rep, but it isn't part of any serialized version of the document (at least with XML).
http://junom.com/document/tWSDL/api-types.html
Also, the XML API (now in ::xml, not ::wsdl) is explained here along with discussion of tDOM usage. The examples are no longer completely accurate, but they are very close. Considering these docs were written almost two years ago, it is surprising how close the final code came to the plans.
http://junom.com/document/tWSDL/api-xml.html
The main difference is the addition of prefix information into the metadata along with the ability to add child elements by reference (instead of actual namespace children). Also, all of the XML elements are created through a single API. This frees future developers from needing to know the bookkeeping details for element metadata, and any changes to the internal details will not affect any developer code.
The next development push will probably be to add support for attributes. That means the ability to define them either globally or locally, associate them with any global simpleType, and include them in document validation. Attribute validation and defaulting should be much easier than what is required for element validation. Internally I intend to use attributes to indicate information about the element content, for instance if/how the content is encoded.
As others have said already, there are many possible mappings, and some of these are more or less convenient depending on the example. In most cases, there is little need to map the full DOM structure into XOTcl objects, because tdom already provides a powerful interface. Stefan's pointer to the two papers is a great introduction to approaches for OO/XML mappings.
A central question is whether you have control over the XML types (e.g. you are able to specify the XML schema) or you want to map given XML structures to given XOTcl classes/objects.
For the first case, the small program below might help you as a start.
For the second case, I would recommend looking into the RSS-client class (in xowiki/tcl/syndicate-procs.tcl in CVS head), which is used to map various forms of RSS files into xotcl objects based on a set of xpath queries. Most likely this can be generalized by providing xpath queries as slot attributes.
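To give an impression of the xpath-based mapping, here is a minimal sketch that fills a dict from a table of field names and xpath queries. It is not the RSS-client class from xowiki: the sample feed, the field names and the query table are assumptions for the example. The tDOM package must be installed.

```tcl
package require tdom

set feed {<rss><channel><title>News</title><item><title>First entry</title></item></channel></rss>}

# Hypothetical mapping table: field name -> xpath query.
# With attribute slots, each query could be kept in the slot
# of the corresponding attribute.
set queries {
    title      string(/rss/channel/title)
    firstItem  string(/rss/channel/item[1]/title)
}

set doc [dom parse $feed]
set root [$doc documentElement]

# Evaluate every query and store the result under the field name
set result [dict create]
dict for {field xpath} $queries {
    dict set result $field [$root selectNodes $xpath]
}
$doc delete

puts $result
```

The resulting name/value pairs could then be passed to the creation of an object of the target class.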
Below is a study of how to map arbitrary XOTcl objects into XML and how to generate the XOTcl objects again from the XML file. Note that this works as well with all acs objects, when using the mappings from xotcl-core. It will create the objects in memory; one has to use save/save_new to persist them in the database. Note, however, that the import/export also exports all IDs, which are of limited use when exchanging objects between systems. To address these problems, one might look into the import/export facilities of xowiki.
After loading the code underneath, one can run the following example:
#
# Define two simple classes with two sample instances
#
Class C -parameter {x y}
C c1 -x 1 -y 10

Class D -superclass C -parameter {{z 100}}
D d1 -x 2 -y 20

#
# Get the XML representation of the two objects
#
set XML [xml getXML c1 d1]

# Destroy the objects from memory
d1 destroy
c1 destroy

#
# Get the XML Schema definition for the two classes
#
set XMLS [xml getSchema ::C ::D]

#
# Parse the XML file and obtain the objects from the
# contents.
#
xml getObjects $XML

ns_log notice c1=[info command c1]
ns_log notice c1=[c1 info class]
Note that this code is just a study, for real world usage, one should provide an interface for named/unnamed objects, class mappings, namespace fiddling (both xml and tcl).
######################################################
#
# The method allslots computes the set of all slots of a class,
# including the slots of the classes of the full type hierarchy.
#
Class instproc allslots {} {
  set slots [my info slots]
  # remember slotnames
  foreach slot $slots {set slotname([namespace tail $slot]) 1}
  # iterate over class structure
  foreach c [my info heritage] {
    foreach slot [$c info slots] {
      set key slotname([namespace tail $slot])
      # don't add slots which are already defined in more specialized classes
      if {[info exists $key]} continue
      set $key 1
      lappend slots $slot
    }
  }
  # return slot objects always in same order
  return [lsort $slots]
}

#
# The object xml implements import and export of XOTcl objects
# via XML.
#
# The most important methods are
#
#  - getSchema...
#
#    returns an xml schema derived from the XOTcl slot structure of the specified
#    XOTcl classes. Currently, it defines all attributes as string values.
#    It could use as well the database types kept in the db-slots in xotcl-core.
#
#  - getXML ...
#
#    returns an XML representation of the specified XOTcl objects. The
#    generated XML text contains only the instance attributes defined
#    via slots (e.g. parameters)
#
#  - getObjects
#
#    parses the specified XML text and creates the XOTcl objects from
#    the contents of the XML file.
#
#
Object xml
xml set schemaName http://your.host.net/xotcl
xml set schemaFile http://your.host.net/xotcl.xsd

xml proc tcl_to_xml {name} {regsub -all :: [string trimleft $name :] _.._ name; return $name}
xml proc xml_to_tcl {name} {regsub -all _\.\._ $name :: name;return $name}

xml proc getSchema args {
  my instvar schemaName
  dom createDocument xs:schema doc
  $doc documentElement root
  $root setAttribute targetNamespace $schemaName
  $root setAttribute xmlns $schemaName
  $root setAttribute xmlns:xs "http://www.w3.org/2001/XMLSchema"

  foreach class $args {
    set node [$doc createElement xs:complexType]
    $node setAttribute name [my tcl_to_xml $class]
    $root appendChild $node
    set seq [$doc createElement xs:sequence]
    $node appendChild $seq
    foreach slot [$class allslots] {
      $seq appendFromList [list xs:element [list name [$slot name] type xs:string] {}]
    }
    set att [$doc createElement xs:attribute]
    $att setAttribute name name
    $att setAttribute type xs:string
    $node appendChild $att
  }

  set node [$doc createElement xs:complexType]
  $node setAttribute name xotcl
  $root appendChild $node
  set seq [$doc createElement xs:sequence]
  $node appendChild $seq
  set choice [$doc createElement xs:choice]
  $seq appendChild $choice
  foreach class $args {
    set ncName [my tcl_to_xml $class]
    $seq appendFromList [list xs:element [list name $ncName type $ncName] {}]
  }
  return [$root asXML]
}

xml proc getXML args {
  my instvar schemaName schemaFile
  dom createDocument xotcl doc
  $doc documentElement root
  $root setAttribute xmlns $schemaName
  $root setAttribute xmlns:xsi "http://www.w3.org/2001/XMLSchema-instance"
  $root setAttribute xsi:schemaLocation "$schemaName $schemaFile"

  foreach o $args {
    set node [$doc createElement [my tcl_to_xml [$o info class]]]
    $root appendChild $node
    $node setAttribute name [$o self]
    foreach slot [[$o info class] allslots] {
      set name [$slot name]
      $node appendFromList [list $name "" [list [list #text [$o $name]]]]
    }
  }
  return [$root asXML]
}

xml proc getObjects {XML} {
  my instvar schemaName
  set objects [list]
  dom parse $XML doc
  $doc documentElement root
  $root setAttributeNS "" xmlns:default [$root getAttribute xmlns]
  foreach node [$root selectNodes /default:xotcl/*] {
    set command [list [my xml_to_tcl [$node nodeName]] create [$node getAttribute name]]
    foreach att [$node childNodes] {
      lappend command [list -[$att nodeName] [$att text]]
    }
    lappend objects [eval $command]
  }
  return $objects
}