Forum OpenACS Development: rp_handler and host-node mapped subsites

If rp_handler is serving a POST request for a host-node mapped subsite, POST variables can be lost. This happens in case 2 below. The GET variables are gathered from [ns_conn request] and included in a redirection. However, POST variables are not passed on.


# 2. handle special case: if the root is a prefix of the URL,
#    remove this prefix from the URL, and redirect.
if { $root ne "" } {
    if { [regexp "^${root}(.*)$" $url match url] } {

        # only a GET request line matches here, so only GET variables
        # are carried over into the redirect URL
        if { [regexp {^GET [^\?]*\?(.*) HTTP} [ns_conn request] match vars] } {
            append url ?$vars
        }
        if { [security::secure_conn_p] } {
            # it's a secure connection.
            ...
        } else {
            ad_returnredirect \
                -allow_complete_url http://[ad_host][ad_port]$url
            return "filter_return"
        }
    }
}

Is rp_form_put the correct proc to use to pass along POST variables in this case?
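
To see why only GET variables survive, the query-string regexp from the excerpt can be tried outside the server. This is a minimal sketch in plain Tcl: sketch_query_from_request is a hypothetical helper, and the request lines are made-up stand-ins for what [ns_conn request] would return.

```tcl
# Hypothetical helper: applies the regexp from the excerpt to a raw
# request line and returns the query string, or "" when nothing matches.
proc sketch_query_from_request {request_line} {
    if {[regexp {^GET [^\?]*\?(.*) HTTP} $request_line match vars]} {
        return $vars
    }
    return ""
}

# A GET request line carries its variables in the URL, so they are captured.
puts [sketch_query_from_request "GET /subsite1/page?a=1&b=2 HTTP/1.1"]

# A POST request line does not match the pattern at all; the form data
# travels in the request body, which the regexp never sees.
puts [sketch_query_from_request "POST /subsite1/page HTTP/1.1"]
```

The second call returns an empty string, which is the situation described above: nothing is appended to the redirect URL for a POST.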

Posted by Ryan Gallimore on
rp_filter is the correct proc name - my mistake.

I was able to resolve the issue by converting ns_getform vars into a query request. This should take care of any POSTed files too.

This seems to work, but is there a better way to do it?


--- request-processor-procs.tcl 11 May 2010 23:38:17 -0000 1.102.2.1
+++ request-processor-procs.tcl 23 Jun 2011 13:03:52 -0000
@@ -536,11 +536,19 @@
set url [ad_conn url]
# 2. handle special case: if the root is a prefix of the URL,
# remove this prefix from the URL, and redirect.
+
if { $root ne "" } {
if { [regexp "^${root}(.*)$" $url match url] } {
-
- if { [regexp {^GET [^\?]*\?(.*) HTTP} [ns_conn request] match vars] } {
- append url ?$vars
+ if { [regexp {^(GET|POST) [^\?]*\??(.*) HTTP} [ns_conn request] match method vars] } {
+ if {$method eq "GET"} {
+ append url ?$vars
+ } else {
+ foreach {name value} [ns_set array [ns_getform]] {
+ append vars "[ad_urlencode $name]=[ad_urlencode $value]&"
+ }
+ set vars [string range $vars 0 end-1]
+ append url ?$vars
+ }
}
if { [security::secure_conn_p] } {
# it's a secure connection.
@@ -548,6 +556,7 @@
-allow_complete_url https://[ad_host][ad_port]$url
return "filter_return"
} else {
ad_returnredirect \
-allow_complete_url http://[ad_host][ad_port]$url
return "filter_return"
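
The encoding step of the patch can be sketched in plain Tcl so it can be tried outside the server. This is only an illustration under assumptions: sketch_urlencode is a hypothetical stand-in for ad_urlencode, and the name/value list stands in for [ns_set array [ns_getform]]. Joining the pairs with & avoids having to trim a trailing separator afterwards.

```tcl
# Hypothetical stand-in for ad_urlencode: percent-encodes every byte of the
# UTF-8 form of the string except unreserved characters.
proc sketch_urlencode {s} {
    set out ""
    foreach byte [split [encoding convertto utf-8 $s] ""] {
        if {[regexp {^[A-Za-z0-9._~-]$} $byte]} {
            append out $byte
        } else {
            append out [format %%%02X [scan $byte %c]]
        }
    }
    return $out
}

# Turns a flat name/value list into a query string without a leading "?".
proc sketch_form_to_query {pairs} {
    set parts {}
    foreach {name value} $pairs {
        lappend parts "[sketch_urlencode $name]=[sketch_urlencode $value]"
    }
    return [join $parts &]
}

puts [sketch_form_to_query {name {a b} x 1&2}]
```

The sketch covers the encoding only; whether the resulting URL is short enough for a given browser is a separate question.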

Posted by Dave Bauer on
I don't believe you can send a file in a GET query variable.

Why are you doing a redirect in this case?

Is someone going to http://domain.com/subsite when you want them to go to http://subsite.com ?

Posted by Ryan Gallimore on
You can't send a file but according to the ns_getform code, when called, it copies all files to temp locations and adds the path, mime type and size to the ns_set. So by mapping the ns_getform to GET variables, this should work for files too. I have confirmed that it works for other POST vars.

The RP is doing the redirect in rp_filter, for the case you mention. In survey there is a confirmation template that uses ad_conn url to retrieve the return_url. I want to preserve that core code so as to avoid forking acs-templating.

See survey/www/admin/survey-create.tcl

Posted by Ryan Gallimore on
The confirmation template is at packages/acs-templating/resources/forms/confirm-button.*

It calls ad_conn url which returns /subsite/page even if we are operating from a mapped subsite. So the RP redirects to /page losing the POST vars when we come from a form.

Posted by Gustaf Neumann on
Note that there are implementation limits on the maximum size of URIs in various user agents, depending on their versions. Ignoring these limits can lead to crashes, hanging requests and truncated URLs. Therefore mapping arbitrary POST requests into GET requests with query variables is not a good idea if you want to support an open set of user agents, especially when large files might be involved.

http://classicasp.aspfaq.com/forms/what-is-the-limit-on-querystring/get/url-parameters.html
http://www.boutell.com/newfaq/misc/urllength.html

Posted by Ryan Gallimore on
Thanks Gustaf, I'd forgotten about that. Can you suggest an alternate method of handling this case? It seems a shortcoming that the redirect only supports GET requests.
Posted by Gustaf Neumann on
Have you looked at rp_internal_redirect?
Posted by Dave Bauer on
Ryan,

Can you explain the case you are trying to fix?

Why is the "root a prefix of the URL?" I am not sure what that means in this context.

I.e., what URL is being requested and what URL is it redirected to? Why is a redirect happening on a POST?

Posted by Ryan Gallimore on
Hi Dave,

The case is described in the RP code above.

I have mapped a hostname called subsite1 to a subsite at /subsite1

A confirmation template in acs-templating (path above) generates a return_url from [ad_conn url] of /subsite1/page.

Based on the RP logic above, the host name matches the subsite name, so the RP attempts to replace it, redirecting the request again.

But it does this for GET requests only. For my example, I post vars to the confirmation template, and these are lost. So in this case, my page does not work at all. See /packages/survey/www/admin/survey-create.tcl for an example. In a host-node mapped subsite, this page does not create a new survey at all.

Why is a redirect happening on a POST?

That's the problem I'm trying to solve. It seems the code was only written to handle GET.

Posted by Dave Bauer on
Fix this, it is easier!

"A confirmation template in acs-templating (path above) generates a return_url from [ad_conn url] of /subsite1/page."

Posted by Ryan Gallimore on
We want [ad_conn url] to be used for the generic case, don't we?

And shouldn't the RP "forward" POST requests too for mapped subsites?

Posted by Torben Brosten on
Hi,

I haven't had time to dwell on the issue here, but having worked a bunch with host node mapping and ad_conn url issues, this may be of value:

site_node::conn_url

Use this in place of ns_conn url when referencing host_nodes. This proc returns the appropriate ns_conn url value, depending on whether the current connection uses host_node_map or the hostname's domain.

Defined in CVS HEAD packages/acs-tcl/tcl/site-nodes-procs.tcl

It helps set up a sane return_url etc. with form processing.

cheers,

Torben

Posted by Ryan Gallimore on
Thanks Torben, but site_node::conn_url does not seem to return anything. Is it meant to be used directly?

Even though it is easier to fix the return url in the confirm template, I can see this issue occurring again when there is a POST to a mapped subsite page. It should be solved in the RP.

Gustaf, I see your point about converting POST vars to GET vars, but we already do this in the RP (grep [export_entire_form_as_url_vars]). File contents are not included in the URL, but written to temporary locations in the file system. Only the file properties are available in the POST vars.

I looked into rp_internal_redirect but I'm not sure how to convert a request URL like /site1/page into a physical path - is there a proc for that?

Posted by Torben Brosten on
Ryan, site_node::conn_url only works on CVS head. If you have a prior version, it's not stable. There were modifications on dependent procedures.
Posted by Eduardo Santos on
Hi Ryan,

I've posted this same bug here on the forums and in the bugtracker a long time ago, and got no feedback. Take a look: https://openacs.org/forums/message-view?message_id=1630910

For now, I changed the confirm_template to use ns_conn instead of ad_conn, which just works. But it seems a terrible workaround to me.

I did a long investigation about the implications on RP, and I could not get to any conclusion. I tried the same approach that you described in your host node map patch, and many others. But I got too many problems, such as the ones Gustaf pointed out.

We should make a joint effort to fix this problem now, because it's a nice feature on OpenACS that just doesn't work.

Posted by Dave Bauer on
Hi.

I suggest NOT redirecting the POST, just process it, and then when the post is done, it'll return a redirect and the user can continue normally.

Even with that suggestion, I am still unclear how this situation would arise in the first place.

That is, what is the use case where a user is at

http://subsite1 and clicks OK on a form and the POST url is http://mainsite.com/subsite1?

It appears this is when you would need to process a redirect.

If the user is at http://subsite1 the POST should go to http://subsite1 and no redirect would be necessary.

Posted by Eduardo Santos on
Hi Dave,

One simple example: let's say you have a personalized login form that goes above the system pages. This form contains only two fields: username and e-mail.

The action for this form will be the subsite_url/register/. If you are using host node map, this link will point to subsite_url/register, but the address you are accessing does not contain the subsite_url.

In this case, when you are using the domain mapped to the host, you won't be able to log in, because the POST is sent to subsite_url/register/ and the RP won't be able to deal with this.

This is just an example, but in almost every form in the system this problem will happen.

Posted by Dave Bauer on
If any code is generating a fully qualified URL for the POST action, that's a bug and that should be fixed.

Trying to redirect a POST is just fixing the symptom and not the real bug.

The POST should always go to the same domain name that the form was generated from so there is no need for a redirect.

Posted by Ryan Gallimore on
OK, I agree. We need to alter ad_conn url* in the case of host-node mapped subsites, then. If we did that in this case, the RP redirect would not be necessary.

Any thoughts?

Posted by Torben Brosten on
Insert the value from site_node::conn_url when ad_conn url is from a host node mapped site?

This would be the simplest solution, but doesn't work for all cases, because sometimes ad_conn url is used to build a reference only available from the main domain.

The alternative is to swap ad_conn url with site_node::conn_url for the specific cases that do work.

Posted by Dave Bauer on
Any references available ONLY from the main domain are also bugs.

See the new embeds APM feature (I'll try to find a link to more info), which should resolve this issue by allowing search/notification and other site-wide services to be embedded in the subsite context.

Posted by Ryan Gallimore on
How does using embed for those specific services fix the general problem when I access http://mapped/page and [ad_conn url] returns /mapped/page? Shouldn't we fix ad_conn url to show the proper URL: /page?
Posted by Dave Bauer on
Those are two different issues. ad_conn url should return the URL you actually used to access the page.
Posted by Torben Brosten on
It would be nice to finally fix ad_conn url for host node mapping cases.

It's been a couple of years since I've been messing with host node mapping.

Apparently site_node::conn_url isn't a direct substitute as I mentioned. I'm terribly sorry about spouting that without re-checking.

site_node::conn_url is a proc for building urls that work when ad_conn url breaks for host node mapping.

One can use site_node::conn_url to generate urls that work with or without hostnode mapping, so that a host node mapped site works from the main domain as well.

Using site_node::conn_url is really useful when the main domain has a ssl/https setup, because the site can switch back and forth between http/https without generating a security warning or changing the display templates. You can direct a user to login via https and then switch them back to http on the hostnode mapped subsite after login. The UI flow is smooth.

Posted by Torben Brosten on

Eduardo,

Regarding the register case, take a look at universallogos.com. It's mapped to dekkasupply.com/universallogos. Login and Register work as expected from universallogos.com.

The return_url is specified by ad_get_login_url

In this particular case, the return_url doesn't redirect back to the hostnode mapped domain, because many services break in that context. Instead, it serves the subsite via the main domain.

The login template etc are the default ones.

login.tcl has this slight change, to permit redirects back to the hostnode site:

<     if { [util::external_url_p $return_url] } {
<       ad_returnredirect -message "only urls with a valid host name are permitted" "."
---
>     set locations_list [security::locations]
>     # there may be as many as 3 valid full urls
>     set external_url [util_complete_url_p $return_url]
>     foreach location $locations_list {
>         set external_url [expr { $external_url && ![string match "$location/*" $return_url] } ] 
>     }
>     if { $external_url } {
>       ad_returnredirect -message "only urls without a host name are permitted" "."
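
The check in the modified login.tcl can be sketched in plain Tcl for experimenting outside the server. This is a hedged illustration, not the OpenACS code: sketch_complete_url_p is a hypothetical stand-in for util_complete_url_p (assumed here to mean "starts with a scheme"), the locations list stands in for [security::locations], and a plain prefix comparison is used where the diff uses string match.

```tcl
# Hypothetical stand-in for util_complete_url_p: true when the URL starts
# with a scheme such as "http:".
proc sketch_complete_url_p {url} {
    return [regexp -nocase {^[a-z][a-z0-9+.-]*:} $url]
}

# Returns 1 when return_url is a complete URL that does not start with any
# of the given locations followed by "/", and 0 otherwise.
proc sketch_external_url_p {return_url locations} {
    set external [sketch_complete_url_p $return_url]
    foreach location $locations {
        set prefix "$location/"
        set under_location [string equal -length [string length $prefix] $prefix $return_url]
        set external [expr {$external && !$under_location}]
    }
    return $external
}

# Made-up locations for illustration only.
set locations {http://main.example https://main.example http://mapped.example}
puts [sketch_external_url_p "/page" $locations]
puts [sketch_external_url_p "http://mapped.example/page" $locations]
puts [sketch_external_url_p "http://elsewhere.example/page" $locations]
```

Relative URLs and URLs under one of the known locations are accepted; anything else is treated as external.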
Posted by Ryan Gallimore on
I'm looking for suggestions to solve this issue. Is anyone available to lend a hand?

The problem affects all POST requests to a host-mapped subsite.

Posted by Dave Bauer on
I still dispute there is a bug in the request processor. I run sites with host-node mapped subsites, processing forms every day. I have never seen this behavior.

The action URL for the form should be to the host node mapped URL so there should not be any problem.

Please provide a code sample where the form action goes to the main site instead of the URL of the request.

Posted by Dave Bauer on
Ryan,

Fix the confirm template to use the correct url. That seems like a simple fix that will solve your problem.

Posted by Ryan Gallimore on
The confirm template has fixed the problem for survey, but not for the general case.

For example, on a host-mapped site in file-storage, go to modify permissions.
The standard permissions table is visible.

form action="/mapped_subsite/permissions/perm-modify" method="post"

The RP subsite hack can't process the post so we get the error:

We had some problems with your input:

You must supply a value for object_id
You must supply a value for return_url

When I try the same thing using the main site URL it works as expected.

Dave can you try to reproduce this?

Posted by Dave Bauer on
So any code that's generating a POST url explicitly, instead of using a relative URL or a self-submit, is susceptible to this.

I don't have time right now to fix it, but it looks like any call to ad_conn url will break.

Posted by Ryan Gallimore on
That's what I'm seeing.

Pinning down where to set ad_conn url is tricky, so if you get a chance, any help would be appreciated.

Posted by Torben Brosten on
As long as you're staying in the same directory, consider converting to a relative reference that's within the same directory by removing any leading forward slash references (and what's between them).

for example,

if ad_conn url returns:

/index.tcl

convert it to:

index.tcl

This doesn't work if you're swapping around in the file hierarchy, but should work fine for the simple cases.
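
A minimal sketch of that conversion in plain Tcl, assuming "removing any leading forward slash references (and what's between them)" means keeping only the last path segment. The proc name sketch_relative_ref is hypothetical.

```tcl
# Keeps only the final path segment of a URL path, so the browser resolves
# it relative to the directory of the page it is currently on.
proc sketch_relative_ref {url} {
    return [lindex [split $url /] end]
}

puts [sketch_relative_ref "/index.tcl"]
puts [sketch_relative_ref "/subsite1/page"]
```

A path that ends in a slash, such as /subsite1/, yields an empty string, which a browser treats as a reference to the current document rather than to the directory, so that case would need separate handling.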

Posted by Eduardo Santos on
Hi Dave, Torben and everyone,

I've done a lot of investigation on this problem, and yes, basically any call to ad_conn will eventually fail in a host node mapped subsite.

The fix is tricky because the call to ad_conn -set is done AFTER the filters are processed. So, even if we fix the filters (rp_filter) and fix the redirect, ad_conn -set won't see this change.

Fixing it would be a big problem, because we would have to change the exact place in request-processor where ad_conn -set is called. At first glance, it would be a major rewrite of the request processor code.

Maybe somebody with better knowledge of the RP can propose a better solution.

Posted by Torben Brosten on

Ryan, Eduardo,

I agree, openacs-core things break in host-node mapped sites. Many times, it's because a package is limited to a single mount, often outside of the host node mapped file hierarchy. These should be re-worked to handle more general cases.

I guess what I'm saying is that the rp_handler is not the place to be throwing form posts. Use it to generate urls that work in the first place.

Most apps reside within a single directory. By omitting all forward slashes, browsers use existing directories. The case is simple for all these host node map directories: no change required. For example, take a look at birdswelcome.com. All the forms in that site use POST. All are in a single directory. No issues, even when accessing via dekkasupply.com/birdswelcome.

For apps that require processing between a hostnode-mapped domain and the main-domain, such as login and register, build the post and return_url ahead of time using absolute references.

site_node::conn_url is helpful. It is especially useful in a place like index.vuh in a hostnode-mapped directory.

Here's an example from universallogos.com/index.vuh

set conn_url [site_node::conn_url]
set conn_url0 [ad_conn url]
ns_log Notice "/www/universallogos/index.vuh conn_url = '${conn_url}', ad_conn url = ${conn_url0}"
if { $conn_url eq "" } {
    regsub -- {/universallogos/} $conn_url0 {/} conn_url
}

if { [string match "/resources/*" $conn_url ] } {
    set redirect_url "/www${conn_url}"
    rp_internal_redirect $redirect_url
    ad_script_abort
} elseif { [string match "/search/*" $conn_url ] } {
    set query [ns_conn query]
    set redirect_url "http://dekkasupply.com${conn_url}?${query}"
    ns_log Notice "/usr/local/www/dekkasupply.com.com/openacs-4/www/universallogos/index.vuh: redirecting to: ${redirect_url}"
    ad_returnredirect -allow_complete_url $redirect_url
    ad_script_abort

} else {

    ::xowiki::Package initialize -ad_doc { 
 
        The script uses an XoWiki page as root page 
        of the site. Here, the start page is /xowiki/ followed 
        by the actual URL, as specified as the value after "-url" below. 
        Replace this value, in case a different XoWiki instance 
        name should be used. 
   
        @author Gustaf Neumann (gustaf.neumann@wu-wien.ac.at) 
        @creation-date July, 2006 
        @cvs-id $Id: index.vuh,v 1.5 2006/09/15 16:45:00 gustafn Exp $  
    } -parameter { 
        {-m view} 
        {-folder_id:integer 0} 
    } -url /universallogos/xowiki${conn_url}

    ::$package_id reply_to_user [::$package_id invoke -method $m] 
    ad_script_abort 
}

cheers,
Torben

Posted by Ryan Gallimore on
I see your point, Torben. But not all POSTs are made to the same directory. See my permissions example above.

The problem, as Eduardo pointed out, lies with when to set ad_conn url, and the hack for mapped subsites that is already in place in rp_filter.

# 2. handle special case: if the root is a prefix of the URL,
# remove this prefix from the URL, and redirect.

Posted by Torben Brosten on
Ryan,

Perhaps you missed the second point in that post.

It begins with: "For apps that require processing between a hostnode-mapped domain and the main-domain"..

Also, see my previous post to Eduardo regarding building return_url in ad_get_login_url.

You should be able to build a proc that returns the absolute reference you need, without messing with the request processor or redirecting.

Are you trying to hide a form processing url from the web? If so, put the form processing in an index.vuh page. No redirect required.

Posted by Eduardo Santos on
Hi Torben,

Thanks for the reply. I understand your suggestion and I've implemented it sometimes. However, it's a workaround and it doesn't solve the real problem.

The case is: you should be able to point a URL to /subsite/page and it should just work on a host-node mapped subsite. Trying to map all the places in the system where this call appears is very difficult, and it can always lead to some failure.

Take a look at this link: http://fisheye.openacs.org/browse/OpenACS/openacs-4/packages/acs-tcl/tcl/request-processor-procs.tcl?r=1.105#to542

Somebody knew about this problem and fixed it for GET requests, but it stays for POST requests.

As you can see, ad_conn -set comes a little bit later: http://fisheye.openacs.org/browse/OpenACS/openacs-4/packages/acs-tcl/tcl/request-processor-procs.tcl?r=1.105#to631

I've tried a lot of different patches without success. What I've been doing until now is trying to fix the symptom, not the disease. I would like very much to see a real fix for this problem, but I couldn't find one yet.

Posted by Torben Brosten on
Hi Eduardo,

"The case is: you should be able to point a URL to /subsite/page and it should just work on a host-node mapped subsite. Trying to map all the places in the system where this call appears is very difficult, and it can always lead to some failure."

Between Ryan's example and yours, I do believe I'm understanding the issue now. I still think the fix is to have the code provide the correct url.

ad_conn url is an incomplete solution that breaks expected behavior on host-node mapped sites.

Given the complexity of the RP and the depth of the fix that is required, the appropriate fix, to me, seems to be to replace the instances of ad_conn url with a proc that has predictable behavior, i.e. one that returns ad_conn url for the main site and a modified url for host-node mapped sites (when serving pages from there). site_node::conn_url *was* supposed to help support creating a proc like this, but it *is* broken at the moment. I'll see if I can re-create the proc as a replacement for ad_conn url as originally intended.
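
The behavior described for such a proc can be sketched in plain Tcl. This is a hypothetical illustration, not the implementation of site_node::conn_url: the proc name sketch_conn_url, its arguments, and the host_map dict (hostname to mapped subsite root) are all assumptions made for the example.

```tcl
# host_map: dict mapping a hostname to the subsite root it is mapped to,
# e.g. {subsite1 /subsite1}. Hypothetical stand-in for the host-node map.
proc sketch_conn_url {host url host_map} {
    if {![dict exists $host_map $host]} {
        # Main site (or unmapped host): leave the URL alone.
        return $url
    }
    set root [dict get $host_map $host]
    set len [string length $root]
    if {$len > 0 && [string equal -length $len $root $url]} {
        set rest [string range $url $len end]
        # Only strip the root when it ends on a path boundary.
        if {$rest eq ""} {
            return "/"
        }
        if {[string index $rest 0] eq "/"} {
            return $rest
        }
    }
    return $url
}

set host_map {subsite1 /subsite1}
puts [sketch_conn_url subsite1 /subsite1/page $host_map]
puts [sketch_conn_url mainsite.com /subsite1/page $host_map]
```

Unlike the "^${root}(.*)$" regexp in rp_filter, this sketch leaves a URL such as /subsite1x/page untouched, because the root has to end on a path boundary before it is stripped.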

cheers,

Torben

Posted by Dave Bauer on
I still maintain the correct fix is to stop trying to mix URLs outside of a host-mapped subsite with those in the subsite.

To fix this you need to update packages such as notifications, search, and any others to handle all the facilities using the EMBEDs feature. This is very easy to do, but of course, I am not going to explain it here :) OK, I'll give the short, short version. You just edit the APM dependency settings to embed the existing package. Any web-accessible files you want to embed go into the package-key/embed/ directory, and they are available under /my-package/search (for example). That should be it. It finally solves the issue of keeping context for these site-wide services.

Then what you want to do is embed these into your subsite package.

If your subsite has unique behavior you should create a new subsite that inherits from the acs-subsite package and put your custom code in the new subsite package.

Seriously. If you can upgrade to OpenACS 5.7 your life will be much easier.

Posted by Ryan Gallimore on
Dave, how does embedding help when ad_conn url will still return incorrectly? Eduardo's links identify the hack that detects the subsite "root" of the URL and handles GETs but not POSTs.

What if the page I am posting to is a lib file inside acs-subsite? I can't very well embed acs-subsite inside (as in my case) a package that extends it.

I don't see getting the proper URL as unique behaviour.
Calling ad_conn url while in http://mapped_host/
should return "/" not /subsite/

Posted by Dave Bauer on
I agree. Use the correct URL. Fix whatever code is returning the wrong URL. I can't see any case where ad_conn URL returning something that is outside the host mapped site makes any sense.
Posted by Ryan Gallimore on
In OCT it was decided that the erroneous URLs would be fixed instead of the RP. This is most likely MUCH easier.

These URLs don't satisfy the redirect condition in the RP so the POST is processed.

For the permissions admin include, here is an example of a modification on oacs-5-6 branch:

Index: www/permissions/perm-include.tcl
===================================================================
RCS file: /cvsroot/openacs-4/packages/acs-subsite/www/permissions/perm-include.tcl,v
retrieving revision 1.11
diff -u -r1.11 perm-include.tcl
--- www/permissions/perm-include.tcl 4 Jan 2008 16:34:01 -0000 1.11
+++ www/permissions/perm-include.tcl 17 Aug 2011 17:38:00 -0000
@@ -61,9 +61,7 @@
display_template {input type="checkbox" name="perm" value="@permissions.grantee_id@,remove"}
}

-
-
-set perm_url "[ad_conn subsite_url]permissions/"
+set perm_url "[subsite::get_url]permissions/"

if { ![exists_and_not_null user_add_url] } {
set user_add_url "${perm_url}perm-user-add"

Posted by Torben Brosten on
Ryan,

So.. ad_conn url should be replaced with subsite::get_url for all/most cases, right?

How nice it is to see the functionality already available, if only buried in obscure jargon.

cheers,

Torben

Posted by Torben Brosten on
I see now.. a direct replacement will still need the tail portion of ad_conn url, if any.
Posted by Torben Brosten on
Okay, site_node::conn_url is fixed on cvs head.

It now is a replacement for ad_conn url as previously suggested.

Let me know if there are any issues with it.