Forum OpenACS Development: rp_serve_abstract_file and PerformanceModeP

I wrote an index.vuh that behaves much like the one in ETP, in that it
does some processing, stuffs it's results in a connection specific
datastructure and calls "rp_serve_abstract_file $template_path" at the
end.

The template at $template_path relies on the connection specific data
to be present, assuming that the index.vuh has been processed before.

The code worked fine until I moved it to another server, where most
page calls for the same url that followed shortly after the first one
failed because the connection specific data was not present. The
reason for this turned out to be that PerformanceModeP (acs-kernel)
was set to 1 on that other server, which does the following according
to the parameter's description:

"Setting this to 1 will tell the request processor to make the
assumption that once a url is mapped to a file, that mapping never
changes. This obviously would cause problems on a development system,
but will improve performance on a production server."

So I wonder, if I (and ETP) use the rp_serve_abstract_file wrongly, by
assuming that the page it is in is never skipped or if that is a bug
in the request processor.

Collapse
Posted by Don Baccus on
So I wonder, if I (and ETP) use the rp_serve_abstract_file wrongly ...
A question for the implementors of ETP (since that's where you got the idea to abuse rp_serve_abstract_file). Here's the header for rp_serve_abstract_file:
ad_proc -private rp_serve_abstract_file
Which part of "-private" don't folks understand?

In general the request processor is one of the best bits of the core when it comes to being annotated with "-public", "-private", and "-deprecated". Those annotations are there for a reason. If there was a simple way to enforce "-private" in the toolkit we would.

Collapse
Posted by Dave Bauer on
Don, I notice this example in the documentation:
 Here is an example index.vuh file that you can adapt to your own purposes:

# Get the paths

set the_url [ad_conn path_info]
set the_root [ns_info pageroot]

# Get the IDs
set content_root 
  [db_string content_root "select content_item.get_root_folder from dual"]
set template_root 
  [db_string template_root "select content_template.get_root_folder from dual"]

# Serve the page
if { [content::init the_url the_root $content_root $template_root] } {
  set file "$the_root/$the_url"
  rp_serve_abstract_file $file
} else {
  ns_returnnotfound
}
The index.vuh is there to allow you to programatically map URLs to content. So it looks like the CR developers did not check with the RP developers. It looks like we will need to work out something that can accomplish the same thing without breaking the rules.
Collapse
Posted by Don Baccus on
Tilmann pointing me to code in the CMS that does this, too.  Grrr.

I've pointed Tilmann in a direction that might work.  I expect he'll let me know very soon now :)

Collapse
Posted by Dan Wickstrom on
That's odd.  I duplicated the cms index.vuh functionality in the CR, so that CMS didn't need to be loaded to view CR content.  I'm surprised that it's documented.
Collapse
Posted by Lars Pind on
Would the same problem occur by using rp_serve_concrete_file? Any reason for not using that?

What I've used in a similar case is:

ad_conn -set file "[acs_root_dir]/packages/lars-blogger/www/archive-display.tcl"
adp_parse_ad_conn_file
Don't know if this has similar side-effects.
Collapse
Posted by Don Baccus on
The direction I pointed Tilmann in is similar, it's a snippet of code buried in the "ad_return_exception_template" proc I added last spring that allows one to return a templated error page.

The idea's the same as your example, i.e. parse and return the template.

Collapse
Posted by Tilmann Singer on
The adp_parse_ad_conn_file calls

[template::adp_parse [file root [ad_conn file]] {}]

which also results in errors when PerformanceMode is switched on. It is skipping the index.vuh and results in a not found error when calling the same url repeatedly. This is different from the error I saw first where PerformanceMode would skip the index.vuh but call the same template afterwards.

Executing the snippet from within ad_return_exception_template as suggested by Don works:

ns_return 200 text/html
    [template::adp_parse [template::util::url_to_file $template_path
    [ad_conn file]] ""]

(this is without the query parameter passing stuff).

In the thread that this originates from we were wondering if there are efficiency issues with the code above, e.g. does it result in more string copying than there would be normally when calling an adp/tcl pair directly?

Or could that become the recommended approach for the problem: how to dynamically decide which tcl/adp pair to serve from an index.vuh?

Collapse
Posted by Don Baccus on
It's a bit faster than the standard ad_parse_ad_conn_file proc, because it doesn't first set a local then check the resulting mime type and preamble.

It probably should, though ... currently the mime type is always returned as text/html.  I'm assuming the notion is that the templating system could eventually return other mime types using a different template rendering engine, that kind of thing.

This should become the "standard" way to do this kind of template dispatching I think, certainly the current method used in the CMS and ETP don't work, as you've proven.

Collapse
Posted by Lars Pind on
I had a plan for creating an "ad_internal_redirect" command, which would do whatever's needed to tell the request processor to return some other page. Perhaps you could expand this solution to do just that, i.e. it should let you return a tcl/adp page, a plain tcl page, a pure adp page, an image, or whatever.

Another thing that I thought I'd combine with that was an "ad_form_put" command, which would stuff variables into the global _ns_form ns_set so that you could use ad_page_contract to read them in the page that you internally-redirect to.

What's your thought on this?

/Lars

Collapse
Posted by Dave Bauer on
Lars,

That sounds like a good solution that will help out with all these issues.

Lars, if you could find a way to do that would be awesome!

Especially an ad_internal_redirect would be very useful I think, if it provides a cleaner way to do this without moving around the full page result in the code like above (not knowing if it has any performance implications). I guess it's necessary to build a hook for that into the request processor, or do you see a possibility to do without?

ad_internal_redirect should in any case make sure that the page it is being called from does not get skipped under PerformanceMode. And it should be possible to call any possibly target, as you said. Specifically with the template::adp_parse solution above I am very happy that it is possible to specify an tcl/adp pair, but further dispatch to yet another adp by doing a "ad_return_template $some_other_adp" within the tcl file.

Collapse
Posted by Lars Pind on
Heh, I was hoping that now that you were already so entangled in this issue, you'd do it instead :)

/Lars

The only thing I can offer is to wrap the above call to adp_parse into a proc, with an optional passing of query parameters.

If it involves changing the request processor (is that what you had in mind?) then it's beyond my capabilities ...

Collapse
Posted by Don Baccus on
I see what's causing the problem in rp_serve_abstract_file ... it's the ad_conn -set file $path line.

This explains why Lars's suggestion which included setting the file failed, though I would've expected the failure behavior to be identical.  Hmmm...I'm not going to track that mystery, though.

The request processor calls rp_serve_abstract_file then sets the file and path_info parts of the URL map to their current ad_conn values.  This URL map is used directly in performance mode to go directly to the correct file rather than do the site node work.

Here's the sequence:

1. rp_handler calls rp_serve_abstract_file for "index.vuh"

2. rp_serve_abstract_file sets ad_conn file to the path to "index.vuh"

3. it then calls the file executing "index.vuh"

4. you call rp_serve_abstract_file with a new file

5. rp_serve_abstract_file sets ad_conn file to this new file path

6. it finishes

7. index.vuh finishes

8. our first rp_serve_abstract_file finishes

9. rp_handler sets the url map to the current ad_conn file value, which was set in step 5 - oops!  Performance mode dies.  Not always because you probably hit different threads on the server rather than the same one over and over, etc.

This tells me that a boolean switch to rp_serve_abstract_file which tells it not to modify the ad_conn file value would do the "ad_internal_redirect" action.

At least I think this is true.  Give it a whirl if you have time, Tilmann, and report back!

If we add this switch and remove "-private" and document the proper way to call it to avoid problems in performance mode then I have no problem with our using it in this way...assuming Tilmann's tests show that it works, that is.

Preventing rp_serve_abstract_file from setting ad_conn file won't work, because at least the handler that gets called for tcl/adp files seems to rely on that setting to point to an actual file.

All that rp_serve_abstract_file actually does is trying to map a concrete file to the url, set the result in ad_conn file and call rp_serve_concrete_file.

rp_serve_concrete_file then looks at the extension of the concrete file and decides which handler to call that actually processes the file (or returns the file itself when there is no handler registered for that type). In the case of a .tcl file the handler is adp_parse_ad_conn_file, which mainly does this: [template::adp_parse [file root [ad_conn file]] {}] and ns_return's the result (like the working suggestion from above).

What do you think of the following approach for creating ad_internal_redirect: introduce a way to tell the rp_handler to skip step number 9, so that it never caches file-url mappings when we don't want it to? This could be done by doing something like "ad_conn -set skip_url2file_cache_p 1" before calling rp_serve_abstract_file, and modifying rp_handler accordingly.

Collapse
Posted by Don Baccus on
Your right, Tilmann ... and your approach will probably work.  Another thing to look at might be rewriting ad_parse_ad_conn_file (renaming it, too?) to take an optional file parameter that overrides its use of ad_conn file.

Since you're digging into this I'll let you dig around a bit more and tell us which approach ends up looking cleaner.  That's the bottom line IMO, let's keep it as kludge-free as we can.

It can be so easy ... Don suggested on IRC to just remember the value of ad_conn file, call rp_serve_abstract_file and restore ad_conn file afterwards, so here is the first attempt at an ad_internal_redirect function:

https://openacs.org/sdm/one-patch.tcl?patch_id=328

I added the capability to enter relative paths by prepending the directory of ad_conn file when there is no / at the beginning of the path to redirect to.

It does not touch the page parameters, so everything passed to the first page will be preserved and you can use ad_page_contract in either one. Having an ad_form_put as Lars suggested above would still be nice though.

It seems to follow nested redirects, e.g. you can call ad_internal_redirect in a file you just redirected to. I tried a circular redirect which resulted in a server crash - not a big problem I guess, since you should never need more than one redirect in practice.

Well, circular redirects shouldn't crash the server, but I don't know how to detect them elegantly. The only way I can think of is to put some kind of counter into ad_conn that gets incremented with each call of ad_internal_redirect and abort when it exceeds a specific number.

BTW, the stacksize of the server that crashed was 1024*1024, so that was propably not the cause of the problem.

Collapse
Posted by Don Baccus on
You'd really have to add a recursion counter, yes.  I wouldn't have to be in ad_conn as it's not really useful in a general sense, you could declare it as a global and just share it between rp_handler (to init recursion depth to zero) and test and increment it in your internal redirector.

In practice though I'm comfortable with your just documentating that infinite recursion is not detected other than by server crash.

Collapse
Posted by Lars Pind on
Given how easy it is to implement recursion detection, it might be worth it:
global __rp_internal_redirect_recursion_counter
if { ![info exists __rp_internal_redirect_recursion_counter] } {
    set __rp_internal_redirect_recursion_counter 0
} elseif { $__rp_internal_redirect_recursion_counter > 10 } {
    error "rp_internal_redirect: Recursion limit exceeded."
}
If you place this at the top of the "rp_internal_redirect" proc, that should work, no?

/Lars

Collapse
Posted by Don Baccus on
You've certainly made it easy for Tils :)  I imagine he'll take the hint  ... won't you, Tils? :)
It's not only easy but also an honour for me to put together the pieces ... The recursion counter works (after adding incr __rp_internal_redirect_recursion_counter to actually increment it, thanks for leaving that to me Lars 😉.

Everybody should test it to see the looong beautiful stacktraces that caught recursions produce ...

BTW, how or why do globals work this way? I read somewhere else that in aolserver >2 globals are flushed upon thread-deallocation (which means end of connection i guess). Yet the tcl_url2file mapping in rp_handler uses globals as if they survive several connections, by storing the url to file mapping in a global array, which wouldn't make any sense if the array was flushed at the end of the connection. This also seems to work because otherwise the failure that caused this thread would have never occured ... so how do globals really work? Or what am I missing here?

(still wondering if the tcl_url2file mapping should not actually be in an ns_share)

i believe they special cased globals that begin with the string "tcl_" so that they persist across connections.
Collapse
Posted by Don Baccus on
Threads aren't deallocated at the end of each connection, no.  You can control how long threads live via parameters in your nsd.tcl init file.

ns_share's deprecated which is why you see nsv_* being used all over the place.  It's been ages since I read the justification for the switch in the AOLserver docs so I don't remember exactly why, though nsv_* are cool so I don't really care.

What you're suggesting is that the map be cached across all threads rather than have each thread cache their own copy.  In theory this sounds wonderful but given the time it took us to figure out why the existing aD code you copied failed and the easiest way to do internal redirection properly, switching to system-wide caching of the map makes me a bit nervous.  There might be hidden gotchas in there we don't know about.

But if you or anyone else has time to look into it with an eye towards convincing us that it would be safe, yeah, why not?

Anyway ... I'd love to see your internal redirect get into the CVS tree, and for the various abusers of rp_serve_abstract_file to get changed to use it, ASAP.  Let me know when you're done and we can talk about making the change.

Collapse
Posted by Don Baccus on
Almost right, Yon, not "tcl_" but any global prefixed with "tcl", "error" or the specific global "env" are preserved. Thanks for the tip, I dove into AOLserver sources and in about 30 seconds uncovered the documentation, in nsd/tclinit.c:
 "proc " CLEANUP_PROC " {{autoclose 0}} {\n"
        "    foreach g [info globals] {\n"
        "       if {![string match tcl* $g]\n"
        "               && ![string match error* $g]\n"
        "               && [string compare env $g] != 0} {\n"
        "           upvar #0 $g gv\n"
        "           if [info exists gv] {\n"
        "               unset gv\n"
        "           }\n"
        "       }\n"
        "    }\n"
Collapse
Posted by Roberto Mello on
Don, in AOLserver 3.4.2 (maybe others), ns_share outputs a notice (presumably by AOLserver) that says ns_share is deprecated because it causes lock contention.

I don't see that notice in AOLserver 3.3. I noticed that when I loaded my OpenACS instance on AOLserver 3.4.

Collapse
Posted by Don Baccus on
Does this mean we still have ns_shares in our (4.x) code?  Or are you running a 3.x site.  I know aD worked to move to nsv_*s but maybe some places got missed.
Collapse
Posted by Lars Pind on
I've added rp_getform, which is a simple wrapper around ns_getform, only it'll create the form if it doesn't already extist.

Then I've added rp_form_put, which will add an entry to the form.

This is useful if you want to create a shortcut to another page using an index.vuh file. For example what I just added for api-doc:

/packages/acs-api-browser/www/proc/index.vuh:

rp_form_put query_string [string range [ad_conn extra_url] [string length "proc/"] end]
rp_form_put search_type "Feeling Lucky"
rp_internal_redirect "/packages/[ad_conn package_key]/www/proc-search"

This lets you simply type

foo.com/api-doc/proc/rp_form_put

into your browser, and you'll get the documentation for this proc.

/Lars

Collapse
Posted by Roberto Mello on
Thanks Lars, that's great. I'll add this shortcut to the developer's guide.
Collapse
Posted by Lars Pind on
Excellent, thanks, Roberto.

Btw, the code got garbled above:

rp_form_put query_string [string range [ad_conn extra_url] [string length "proc/"] end] 
rp_form_put search_type "Feeling Lucky"
rp_internal_redirect "/packages/[ad_conn package_key]/www/proc-search"
/Lars