Forum OpenACS Development: Re: From database to JSON

Collapse
18: Re: From database to JSON (response to 1)
Posted by Gustaf Neumann on
Dear all,

to get data from the database straight to JSON, one should use the native support from the database.

JSON-Tcl processing contains two parts, parsing JSON and generating it. Parsing is for DB application just relevant, when the database returns JSON directly. Below are some figures from different implementations, including tdom, tjson, from the mongdb package in nsf (bson, supporting binary datatypes), and scripted Tcl

microseconds
tdom-parse 16,044.66
tjson-parse 11,543.77
bson-triple 51,567.74
tjson-typed 72,859.57
tjson-triple 72,118.74
json2dict 1,345,247.94

One can see here significant performance differences, which also depend on the result structure. While tdom and tjson return a handle to the internal structure, bson is transformed into a Tcl list structure containing triples of names, types and values, or json2dict returns a dict. One can also use tjson to produce the same triple notation (see "tjson-triple"). The differences between the C implemented versions are not huge (when comparing a similar richness of the output).

In order to generate JSON, from these notations, we see the following results:

microseconds
tdom-generate 9,714
tjson-generate 18,117
bson-generate 58,271

I've also looked at huddle (https://github.com/aplicacionamedida/huddle), using the json2huddle and "huddle jsondump" from there:

microseconds
huddle-parse 9,455,761
huddle-generate 4,394,388
Performance is not the primary reason to use huddle for this kind of application.

We can see, that the C-implemented versions are clear better (no big surprise). I am pretty sure, that rl_json performs similar. All of these have some advantages:

  • tdom is still very fast and established, it keeps the type information internally to achieve the appropriate JSON quoting.
  • json has in some areas advantages
  • bson has support for binary datatypes, but this is not important, when interfacing with PostgreSQL and Oracle). The triple notation makes it ease to provide types for the values on the Tcl level.
For OpenACS it is preferable to provide a common API to these backend packages, without losing much performance.

all the best
-gn

package require json
set jsonfile /usr/local/src/tjson/spec3.json ; set count 10
set fp [open $jsonfile r]; set JSON [read $fp]; close $fp
lappend results [list tdom-parse [time {dom parse -json $JSON test} $count]]]
lappend results [list tjson-parse [time {::tjson::parse $JSON} $count]]
lappend results [list bson-triple [time {::mongo::json::parse $JSON} $count]]
lappend results [list json-typed [time {set typed_spec [::tjson::json_to_typed $JSON]} $count]]
lappend results [list tjson-triple [time {set typed_spec [::tjson::json_to_typed $JSON]; ::tjson::typed_to_custom $typed_spec} $count]]
lappend results [list json2dict [time {::json::json2dict $JSON} $count]]

set T [dom parse -json $JSON test]
lappend results [list tdom-generate [time {$T asJSON} $count]]]

set TJ [::tjson::parse $JSON]
lappend results [list tjson-generate [time {::tjson::to_json $TJ} $count]]

set TRIPLE [::mongo::json::parse $JSON]
lappend results [list bson-generate [time {::mongo::json::generate $TRIPLE} $count]]