Forum OpenACS Q&A: Clustering parameter caching issue

Posted by Khy H on
We have configured a cluster with two members and ran into a scenario where updating a package parameter on one node does not update the parameter value on the second node.

Example below:
==========
Both cluster members have been restarted, and the "foo" parameter of the demo package instance has the value "bar".

Step 10: Node 1: Application code retrieves parameter "foo" by calling parameter::get. This returns "bar".

Step 20: Node 2: The demo package's "foo" parameter is updated to "NOT bar" via /admin/site-map.

Step 30: Node 1: Application code retrieves the demo package's parameter "foo" by calling parameter::get. This returns "bar"; we expected "NOT bar".

===========
In Step 20, the parameter is updated via parameter::set_value. The last line of the parameter::set_value procedure calls ad_parameter_cache to delete the cache entry. In ad_parameter_cache, the delete branch notifies the other cluster members only if the parameter entry exists in the local nsv_dict. In the scenario above, Node 2 has not read "foo" since the restart, so it has no local entry in Step 20, and Node 1 never gets the signal to remove its cached value. We removed the local cache check to address the issue. Is this an appropriate fix? Thank you
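
The scenario can be reproduced with a small stand-alone model. This is only a sketch in Python, not the OpenACS code: the class `Node`, its methods, and the flag `check_local_first` are hypothetical names invented for illustration, and the shared `db` dict stands in for the database.

```python
class Node:
    """Hypothetical model of one cluster member with a local parameter cache."""

    def __init__(self, db):
        self.db = db        # shared store, stands in for the database
        self.cache = {}     # per-node cache, stands in for the local nsv_dict
        self.peers = []     # the other cluster members

    def get(self, name):
        # Read through the local cache, like a cached parameter lookup.
        if name not in self.cache:
            self.cache[name] = self.db[name]
        return self.cache[name]

    def set_value(self, name, value, check_local_first):
        self.db[name] = value
        if check_local_first and name not in self.cache:
            # Nothing cached locally, so the peers are never notified.
            return
        # Cluster-wide invalidation: drop the entry on every node.
        for node in [self] + self.peers:
            node.cache.pop(name, None)


def run_scenario(check_local_first):
    """Replay Steps 10 to 30 and return what Node 1 sees in Step 30."""
    db = {"foo": "bar"}
    node1, node2 = Node(db), Node(db)
    node1.peers, node2.peers = [node2], [node1]
    node1.get("foo")                                      # Step 10
    node2.set_value("foo", "NOT bar", check_local_first)  # Step 20
    return node1.get("foo")                               # Step 30
```

With the local check in place, `run_scenario(True)` leaves Node 1 on its stale value; without it, `run_scenario(False)` picks up the new value.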

Posted by Gustaf Neumann on
I am just finishing some refactoring work and the posture page discussed at the OpenACS conference, and will then make a setup to reproduce the results. Thanks for the good explanation; I will get to this soon.
Posted by Gustaf Neumann on
Hi Khy H,

I was able to reproduce the issue, which occurs when a cluster node writes a new parameter value without having read that parameter via the API before. The change is committed to the repository. Please double-check.

Many thanks for the good report.

Posted by Jonathan Kelley on
Hi Gustaf,

I tried applying the patch in this commit to an existing 5.10.1b4 version of the acs-tcl package and found that it was not updating the cache properly for a custom package parameter.

Here is the commit I patched:
https://github.com/openacs/openacs-core/commit/c0a1cf7b98866f00608634b4df718c754f89c784

I don't see any errors in the log files, and the message appears to propagate to the cluster nodes, but even the node on which the parameter was changed didn't update its cache.

If I modify the code only slightly, the cache is updated properly. Here is the minor update I made, along the same lines as Khy's initial submission.

@@ -515,9 +515,7 @@ if {[ns_config "ns/parameters" cachingmode "per-node"] eq "none"} {
 
     } {
         if {$delete_p} {
-            if {[nsv_dict exists ad_param $key $parameter_name]} {
-                ::acs::clusterwide nsv_dict unset ad_param $key $parameter_name
-            }
+            ::acs::clusterwide nsv_dict unset ad_param $key $parameter_name
             acs::per_request_cache flush -pattern acs-tcl.ad_param-$key
             return
         }

Any suggestions?

Posted by Gustaf Neumann on
Jonathan,

nsv_dict unset raises an error when the key (usually a package_id) does not exist in the nsv array, but not when only the dict key inside an existing entry is missing. So if you try to flush a new parameter of a new package, the exception is raised.

See:

% nsv_dict set ad_param 1001 k1 v1
k1 v1
% nsv_dict unset ad_param 1001 k2
k1 v1
% nsv_dict unset ad_param 1000 k2
no such key: 1000

Have you cherry-picked the single change? Then you are probably missing the change [1].

all the best
-g

[1] https://cvs.openacs.org/browse/OpenACS/openacs-4/packages/acs-tcl/tcl/defs-procs.tcl?u=20&r1=1.81.2.30&r2=1.81.2.31
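
The asymmetry shown in the transcript can be modelled in a few lines. This is a Python sketch under stated assumptions, not the NaviServer implementation and not the change committed in [1]: `nsv_dict_unset` here is a hypothetical stand-in that copies only the behaviour visible above, and `safe_unset` is an illustrative guard.

```python
def nsv_dict_unset(store, key, dict_key):
    """Model of the transcript: a missing outer key raises,
    a missing dict key inside an existing entry is ignored."""
    if key not in store:
        raise KeyError(f"no such key: {key}")
    store[key].pop(dict_key, None)
    return store[key]


def safe_unset(store, key, dict_key):
    """Hypothetical guard: unset only when the outer key exists.
    Returns True if an unset was attempted, False otherwise."""
    if key not in store:
        return False
    nsv_dict_unset(store, key, dict_key)
    return True


# Same data as in the transcript: one package (1001) with one parameter.
store = {1001: {"k1": "v1"}}
```

Calling `nsv_dict_unset(store, 1001, "k2")` leaves the entry untouched, whereas `nsv_dict_unset(store, 1000, "k2")` raises. If a caller swallows that exception silently, any flush that would have followed is skipped without a trace in the logs.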

Posted by Jonathan Kelley on
Gustaf,

Yes, I did cherry-pick that single change. Applying the additional commit resolved the issue I was seeing.

Thanks,
Jon

Posted by Gustaf Neumann on
Great. Another instance of "silent catches considered harmful". After the release of OpenACS 5.10.1, we should clearly reduce their occurrences in the code base.