Forum OpenACS Development: Re: util_httppost

Collapse
10: Re: util_httppost (response to 9)
Posted by Antonio Pisano on
This is a stub of an http GET proc based on the conclusions of this discussion.

It switches to ssl, when available on naviserver, based on the url issued, or on a flag.

If response appears to be gzipped, returns the gunzipped output.

All this procs would go into util::http:: namespace. http:: namespace is not good because conflicts with tcl's.

For Aolserver, I plan to use a conditional definition of the proc, which will use curl.

Here it is, if the direction is right I will go on with the rest, then I'll collect all http procedural api into one file and deprecate the old one.

namespace eval util {}
namespace eval util::http {}

ad_proc util::http::get {
    -url 
    {-headers ""} 
    {-timeout 30}
    {-depth 0}
    {-force_ssl f}
} {
    Issue an http GET request to url.
Switches to SSL whenever encounters an 'https' url.
If force_ssl is set to true, ssl will be used also for 'http://' urls
Returns the data in array get form with array elements page, status, and modified. } { set this_proc [info level 0] if {![regexp "(https|http)://*" $url]} { return -code error "${this_proc}: Invalid url: $url" } set max_depth 10 if {[incr depth] > $max_depth} { return -code error "${this_proc}: Recursive redirection: $url" } # Check wether we will use ssl or not if {$force_ssl || [regexp "https://*" $url]} { if {[info commands ns_ssl] eq ""} { return -code error "${this_proc}: SSL not enabled: $url" } set http_api "ns_ssl" } else { set http_api "ns_http" } set cmd {$http_api queue -method GET -timeout $timeout} # empty header would throw an error if {$headers ne ""} { lappend cmd -headers $headers } # Queue call to the url and wait for response set resp_headers [ns_set create resp_headers] $http_api wait -result page -status status -headers $resp_headers [eval "$cmd $url"] # Get values from response headers, then remove them set content_encoding [ns_set iget $resp_headers content-encoding] set location [ns_set iget $resp_headers location] set last_modified [ns_set iget $resp_headers last-modified] ns_set free $resp_headers # Redirection... if {$status == 302 || $status == 301} { if {$location ne ""} { return [${this_proc} -url $location -force_ssl $force_ssl -headers $headers -timeout $timeout -depth $depth] } else { return -code error "${this_proc}: Redirection without location: $url" } # Page not modified since date specified... } elseif {$status == 304} { set page "" } # If output is gzipped, try decompression... if {$content_encoding eq "gzip"} { # ...first using naviserver API... if {[info commands ns_zlib] ne ""} { set page [ns_zlib uncompress $page] # ...then tcl's (from 8.6) } elseif {[info commands zlib] ne ""} { set page [zlib decompress $page] } } return [list \ page $page \ status $status \ modified $last_modified] }
Collapse
11: Re: util_httppost (response to 10)
Posted by Gustaf Neumann on
Yep, the direction looks right to me!

Why is the force_ssl here? using "http" vs. "https" should be enough. i would see no big benefit on using e.g. "-url http://foo -force_ssl t" over "-url https://foo";.

For large requests, allowing to spool to a file make sense (otherwise 2GB would be the upper limit, since evey Tcl variable can be at most 2 GB large). Additionally, this allows then still a small memory footprint.

Collapse
12: Re: util_httppost (response to 11)
Posted by Antonio Pisano on
I put force_ssl option because I clearly remembered about webservices having http url and requiring SSL. Strange enough...

I enhanced procs with file spooling, but on my installation it didn't work out... does it require some configuration to be enabled? I left the feature disabled by a single commentable line of code.

I have added an util::http::post proc to handle POSTing of form vars and/or files. Many parts of the old util_http_file_upload from Michael Cleverly came out very useful and I could conserve them in the new one. Some time ago I had already enhanced that very proc for my former company, so it could send more than one file, even for single form file fields allowing multiple values.

This is the new tcl file for http client functionalities. I leave it here for revision and approval.

ad_library {

    Procs for http client comunication

    @author Antonio Pisano
    @creation-date 2014-02-13
}

namespace eval util {}
namespace eval util::http {}

ad_proc util::http::get {
    -url 
    {-headers ""} 
    {-timeout 30}
    {-depth 0}
    -force_ssl:boolean
    {-spool_file ""}
} {
    Issue an http GET request to url.
Switches to SSL whenever encounters an 'https' url.
If force_ssl is set to true, ssl will be used also for 'http://' urls
Returns the data in array get form with array elements page, status, and modified. } { set this_proc [info level 0] if {![regexp "(https|http)://*" $url]} { return -code error "${this_proc}: Invalid url: $url" } set max_depth 10 if {[incr depth] > $max_depth} { return -code error "${this_proc}: Recursive redirection: $url" } # Check wether we will use ssl or not if {$force_ssl_p || [regexp "https://*" $url]} { if {[info commands ns_ssl] eq ""} { return -code error "${this_proc}: SSL not enabled: $url" } set http_api "ns_ssl" } else { set http_api "ns_http" } # Spooling to files is disabled for now set spool_file "" set queue_cmd {$http_api queue -timeout $timeout -method GET} # empty header would throw an error if {$headers ne ""} { append queue_cmd " -headers $headers" } if {$spool_file ne ""} { append cmd " -spoolsize 0 -file $spool_file" set page "${this_proc}: response spooled to '$spool_file'" } set queue [eval "$queue_cmd $url"] # Queue call to the url and wait for response set resp_headers [ns_set create resp_headers] set wait_cmd {$http_api wait -status status -headers $resp_headers} if {$spool_file eq ""} { append wait_cmd " -result page" } eval "$wait_cmd $queue" # Get values from response headers, then remove them set content_encoding [ns_set iget $resp_headers content-encoding] set location [ns_set iget $resp_headers location] set last_modified [ns_set iget $resp_headers last-modified] ns_set free $resp_headers # Redirection... if {$status == 302 || $status == 301} { if {$location ne ""} { return [${this_proc} -url $location -force_ssl_p $force_ssl_p -headers $headers -timeout $timeout -depth $depth -spool_file $spool_file] } else { return -code error "${this_proc}: Redirection without location: $url" } # Page not modified since date specified... } elseif {$status == 304} { set page "" } # If output is gzipped, try decompression... if {$content_encoding eq "gzip"} { # ...first using naviserver API... if {[info commands ns_zlib] ne ""} { set page [ns_zlib uncompress $page] # ...then tcl's (from 8.6) } elseif {[info commands zlib] ne ""} { set page [zlib decompress $page] } } return [list \ page $page \ status $status \ modified $last_modified] } ad_proc util::http::post { {-files ""} {-datas ""} -base64:boolean {-filenames {}} {-names {}} {-mime_types {}} {-mode formvars} {-headers ""} -url {-formvars {}} {-timeout 30} {-depth 0} -force_ssl:boolean {-spool_file ""} } { Implement client-side HTTP POST with file uploads. When files are specified for upload, form will be a multipart/form-data, otherwise it will be sent as application/x-www-form-urlencoded. Setting headers for 'multipart/form-data' allow to force the kind of form that will be sent.

The switches -files {/path/to/file /path/to/second-file ... } and -datas {$raw_data_1 $raw_data_2 ...} are mutually exclusive. You can specify one or the other, but not both. NOTE: it is perfectly valid to not specify either, in which case no file is uploaded, but form variables are encoded using multipart/form-data instead of the usual encoding (as noted aboved).

If you specify either -files or -datas you must supply a value for -names, which is the list of names of the respective <INPUT TYPE="file" NAME="..."> form tag.

Specify the -base64 switch if the file (or data) needs to be base-64 encoded. Not all servers seem to be able to handle this. (For example, http://mol-stage.usps.com/mml.adp, which expects to receive an XML file doesn't seem to grok any kind of Content-Transfer-Encoding.)

If you specify -files then -filenames is optional (it can be infered from the name of the file). However, if you specify -datas then it is mandatory.

If -mime_types is not specified then ns_guesstype is used to try and find a mime type based on the filename. If ns_guesstype returns */* the generic value of application/octet-stream will be used.

Any form variables may be specified in one of four formats:

  • array (list of key value pairs like what [array get] returns)
  • formvars (list of url encoded formvars, i.e. foo=bar&x=1)
  • ns_set (an ns_set containing key/value pairs)
  • vars (a list of tcl vars to grab from the calling enviroment)

-headers specifies an ns_set of extra headers to send to the server when doing the POST.

-timeout and -depth, are optional. When POSTing, we are not following redirects, but depth is passed to util::http::get when a redirect happens } { set this_proc [info level 0] if {![regexp "(https|http)://*" $url]} { return -code error "${this_proc}: Invalid url: $url" } set max_depth 10 if {[incr depth] > $max_depth} { return -code error "${this_proc}: Recursive redirection: $url" } # Check wether we will use ssl or not if {$force_ssl_p || [regexp "https://*" $url]} { if {[info commands ns_ssl] eq ""} { return -code error "${this_proc}: SSL not enabled: $url" } set http_api "ns_ssl" } else { set http_api "ns_http" } # sanity checks on switches given if {[lsearch -exact {formvars array ns_set vars} $mode] == -1} { return -code error "${this_proc}: Invalid mode \"$mode\"; should be one of: formvars, array, ns_set, vars" } set variables [list] switch -- $mode { array { set variables $formvars } formvars { foreach formvar [split $formvars &] { set formvar [split $formvar =] set key [lindex $formvar 0] set val [join [lrange $formvar 1 end] =] lappend variables $key $val } } ns_set { for {set i 0} {$i < [ns_set size $formvars]} {incr i} { set key [ns_set key $formvars $i] set val [ns_set value $formvars $i] lappend variables $key $val } } vars { foreach key $formvars { upvar 1 $key val lappend variables $key $val } } } if {$headers eq ""} { set headers [ns_set create headers] } set req_content_type [ns_set iget $headers "Content-type"] set multipart_p [regexp "multipart/form-data" $req_content_type] # We have files to be uploaded, this will be a 'multipart/form-data' request if {$multipart_p || ($datas ne [list] && $files ne [list])} { if {$files ne "" && $datas ne ""} { return -code error "${this_proc}: -files and -datas are mutually exclusive; can't use both" } if {$files ne ""} { foreach file $files filename $filenames mime_type $mime_types { if {![file exists $file]} { return -code error "${this_proc}: Error reading file: $file not found" } if {![file readable $file]} { return -code error "${this_proc}: Error reading file: $file permission denied" } set fp [open $file] fconfigure $fp -translation binary lappend datas [read $fp] close $fp if {$filename eq ""} { lappend filenames [file tail $file] } if {$mime_type eq ""} { lappend mime_types [ns_guesstype $file] } } } set boundary [ns_sha1 [list [clock clicks -milliseconds] [clock seconds]]] ns_set put $headers "Content-type" "multipart/form-data; boundary=$boundary" set payload {} if {$datas ne ""} { if {[llength $datas] != [llength $names]} { return -code error "${this_proc}: Cannot upload file without specifing form variable -name" } if {[llength $datas] != [llength $filenames]} { return -code error "${this_proc}: Cannot upload file without specifing -filename" } foreach data $datas filename $filenames name $names mime_type $mime_types { if {$mime_type eq ""} { set mime_type [ns_guesstype $filename] if {[string equal $mime_type */*] || $mime_type eq ""} { set mime_type application/octet-stream } } if {$base64_p} { set data [base64::encode base64] set transfer_encoding base64 } else { set transfer_encoding binary } append payload --$boundary \ \r\n \ "Content-Disposition: form-data; " \ "name=\"$name\"; filename=\"$filename\"" \ \r\n \ "Content-Type: $mime_type" \ \r\n \ "Content-transfer-encoding: $transfer_encoding" \ \r\n \ \r\n \ $data \ \r\n } } foreach {key val} $variables { append payload --$boundary \ \r\n \ "Content-Disposition: form-data; name=\"$key\"" \ \r\n \ \r\n \ $val \ \r\n } append payload --$boundary-- \r\n # No files to upload, this will be an 'application/x-www-form-urlencoded' request } else { ns_set put $headers "Content-type" "application/x-www-form-urlencoded" set exp_vars [list] foreach {key val} $variables { lappend exp_vars [list $key $val] } set payload [export_vars $exp_vars] } # Spooling to files is disabled for now set spool_file "" set queue_cmd {$http_api queue -timeout $timeout -method POST -body $payload -headers $headers} if {$spool_file ne ""} { append queue_cmd " -spoolsize 0 -file $spool_file" set page "${this_proc}: response spooled to '$spool_file'" } set queue [eval "$queue_cmd $url"] set resp_headers [ns_set create resp_headers] set wait_cmd {$http_api wait -status status -headers $resp_headers} if {$spool_file eq ""} { append wait_cmd " -result page" } # Queue call to the url and wait for response eval "$wait_cmd $queue" # Get values from response headers, then remove them set content_encoding [ns_set iget $resp_headers content-encoding] set location [ns_set iget $resp_headers location] set last_modified [ns_set iget $resp_headers last-modified] ns_set free $resp_headers # Redirection for a POST request is normal, just follow with GET if {$status == 302 || $status == 301} { if {$location ne ""} { return [util::http::get -url $location -force_ssl_p $force_ssl_p -headers $headers -timeout $timeout -depth $depth -spool_file $spool_file] } else { return "" } # Page not modified since date specified... } elseif {$status == 304} { set page "" } # If output is gzipped, try decompression... if {$content_encoding eq "gzip"} { # ...first using naviserver API... if {[info commands ns_zlib] ne ""} { set page [ns_zlib uncompress $page] # ...then tcl's (from 8.6) } elseif {[info commands zlib] ne ""} { set page [zlib decompress $page] } } return [list \ page $page \ status $status \ modified $last_modified] }

Collapse
13: Re: util_httppost (response to 12)
Posted by Gustaf Neumann on
It is developing nicely. A few comments:
- the spoolsize options were added in aug last year, after the release of NaviServer 4.99.5; you can test spooling with the "tip" version of NaviServer from bitbucket, but one should wait for general use until 4.99.6 is released.
- there is already some redundancy between util::http::get and util::http::post. It would be better to implement a "util::http::request -method GET|POST|..." that does the heavy lifting, and maybe convenience methods for "get", "post" etc. on top of this when needed.
- one should use the Tcl expand operator rather than "eval".
- the result of the queue_cmd is not a queue, but a handle
- without requesting a gzipped content (via adding Accept-Encoding gzip), the result will never be gzipped.
- Currently, the list of options ot post is very long and not orthogonal. the data of the post request is either attribute/value pairs, or multipart variants "datas" or "files" if i see this correctly. I think, it would be conceptually nicer to have a "-data [util::http::data ... ]" which passes the raw data to the request. In many cases, "-data [form_vars -form ....]" will be sufficient, when the default encoding is set depending on data provided and multipart. Allowing a user to specify a raw data is certainly useful (e.g. for put requests, dav*, etc.)
- i am not sure, that the many ways specifying variables is needed (it should not part of "post" or "request".
- the "ns_zlib uncompress" does not a gunzip, the proper tcl command should be "zlib gunzip"; in case no decompressor is available, an error should be raised.
Collapse
15: Re: util_httppost (response to 13)
Posted by Antonio Pisano on
I've sent you the new version by mail because I don't want to clobber the whole forum...

Let me know

All the best

Antonio