Forum OpenACS Development: caching ns_sets

Collapse
Posted by Caroline Meeks on
I noticed this code today...Does anyone know how to cache ns_sets correctly? Cachng this query would be a big performance help for dotLRN.
ad_proc -public list_users {
        {-rel_type dotlrn_member_rel}
        community_id
    } {
        Returns the list of users with a membership_id, a user_id, first name,
        last name, email, and role.

        AKS: uncaching this until we figure out how to cache ns_sets correctly
    } {
        return [dotlrn_community::list_users_not_cached \
            -rel_type $rel_type \
            -community_id $community_id
        ]
    }

    ad_proc -private list_users_not_cached {
        {-rel_type:required}
        {-community_id:required}
    } {
        Memoizing helper
    } {
        return [db_list_of_ns_sets select_users {}]
    }
Collapse
2: Re: caching ns_sets (response to 1)
Posted by Ola Hansson on
How about this? Not extremely efficient perhaps, but you wouldn't have to change any other code...

ad_proc -public list_users {
        {-rel_type dotlrn_member_rel}
        community_id
    } {
        Returns the list of users with a membership_id, a user_id, 
first name,
        last name, email, and role.

        AKS: uncaching this until we figure out how to cache ns_sets 
correctly
    } {
#        return [dotlrn_community::list_users_not_cached \
#            -rel_type $rel_type \
#            -community_id $community_id
#        ]

        set users_list [util_memoize "dotlrn_community::list_users_not_cached \
            -rel_type $rel_type \
            -community_id $community_id
        "]

        # Convert to a list of ns_sets

        set ns_sets [list]

        foreach kv_pairs_list $users_list {
            lappend ns_sets [util_list_to_ns_set $kv_pairs_list]
        }

        return $ns_sets
    }



    ad_proc -private list_users_not_cached {
        {-rel_type:required}
        {-community_id:required}
    } {
        Memoizing helper
    } {
#       return [db_list_of_ns_sets select_users {}]

        set result [list]

        foreach ns_set [db_list_of_ns_sets select_users {}] {
            lappend result [util_ns_set_to_list -set $ns_set]
        }

        return $result
    }

Disclaimer: I have only tested it in parts...
Collapse
3: Re: caching ns_sets (response to 1)
Posted by Ola Hansson on
Maybe adding a bolean (optional) "-persist" flag to db_list_of_ns_sets would be a good thing. That way you could make it call "ns_set copy -persist $selection" internally, or without -persist if you leave out the flag.

list_users_not_cached should also take the optional -persist flag, and the flag should be provided when it gets called from a memoization statement.

We would be able to do without my hack above if we implement it this way, I think.

The persistent ns_set needs to be freed at some point, but I haven't thought about that part so much yet... Some help here would be appreciated😊

Collapse
4: Re: caching ns_sets (response to 3)
Posted by Jeff Davis on
You can't cache ns_sets with ns_cache even with the -persist flag. ns_sets are thread local and the ns_cache used by util_memoize is not. Even if you use a thread local cache there is no way to free the ns_set if the handle gets flushed. And besides, you don't want to cache this on a per thread basis. Anyway, I think unless there is some really good reason to use ns_sets (and I don't see one having looked at how this is being used), I think you would be better off caching a list of lists anyway.
Collapse
7: Re: caching ns_sets (response to 4)
Posted by Dan Wickstrom on
Jeff, you're statement about ns_sets is slightly incorrect.  ns_sets created with the -persist flag are not thread local.  The handle for a ns_set created with the -persist flag is stored in a static hashtable that is globally accessible.  Access to persistant sets is protected by a global mutex, so there shouldn't be a problem with accessing it from multiple threads.
Collapse
9: Re: caching ns_sets (response to 7)
Posted by Jeff Davis on
My mistake.  Although I still think it's better to put the data
into a list and ns_cache for that although I guess if
it is indeed persistent you could put the handle in an
nsv or in an unlimited size/time ns_cache (keeping in
mind that there would then be no mechanism to flush
the cache other than by hand).
Collapse
5: Re: caching ns_sets (response to 1)
Posted by Jonathan Ellis on
you can also cache tcl arrays with [array get|set]...
Collapse
6: Re: caching ns_sets (response to 1)
Posted by Tom Jackson on

Jeff,

That is interesting, I use ns_set -persist in a startup script in AOLserver3.4, and it seems to work fine in all threads. I think I use it in init.tcl, so maybe it happens before thread creation?

Of course you still have to use the old ns_share, or new nsv_* to use it later on.

If I had to write this code over again, I would use nsv_array instead of ns_set, unless I needed multiple values per key.

If ns_set is always thread local, that seems like a mistake if the -persist flag is to be of any use. Ns_set offers some unique features, and is almost as fast as nsv arrays, at least for small to medium set sizes. But maybe the problem is one structure accessed by multiple threads with no protection?

If you are going to be looping over the set members, nsv is still efficient, because you can make an copy of the array so you only lock a mutex once pre copy.

Maybe the cacheing procedures could create an array, bulk copy that array to an nsv_array in a single step, and then return a copy of this nsv_array when needed.

Collapse
8: Re: caching ns_sets (response to 1)
Posted by Ola Hansson on
Jeff,

How about using db_multirow instead of db_list_of_ns_sets in "list_users_not_cached", and then call template::util::multirow_to_list (I found a bug in this one - see below)? "list_users" could return a memoized copy of that list of lists in array settable format.

If we don't want to hack all the scripts that call "list_users" we could transform the cached list into a list of ns_sets to be returned for now, no?

There does not appear to exist any db api that directly returns a list-of-lists-in-array-settable-format, and that's a shame. Or is it just escaping my eyes...?

Bug in template::util::multirow_to_list:

ad_proc -public template::util::multirow_to_list { name } {
    generate a list structure representitive of a multirow data source

    @param name the name of an existing multirow data source

    @return a representation of a multirow data source as a list,
    suitable for passing by value in the form { { row } { row } { row } ... }

    @see proc template::util::list_to_multirow
} {

  upvar $name:rowcount rowcount

  set rows [list]

  for { set i 1 } { $i <= $rowcount } { incr i } {

    #upvar $name:$rownum row
    upvar $name:$i row
    lappend rows [array get row]
  }

  return $rows
}
("upvar $name:$rownum row" should be "upvar $name:$i row")

I think Jeff wrote a page about how to make proper patches some time ago, if I'm not mistaken. It would be great if someone can point me to it so I can begin to submit such patches. Thanks.

Collapse
10: Re: caching ns_sets (response to 8)
Posted by Jeff Davis on
that is exactly the format I think is better :)

The patch instructions are at:
https://openacs.org/bugtracker/openacs/patch-submission-instructions.html
(although I guess I should update it since I left out that
diff -uN does not always work since solaris only has diff -c
style context diffs).

Collapse
11: Re: caching ns_sets (response to 1)
Posted by Don Baccus on
I think I really need to work on getting my caching db_* API cleaned up and into the tree before too long ...

Berkeley School of Music uses it, too (and added some enhancements).

The OF folk use ns_sets extensively in dotLRN.

My code doesn't typically, in particular Greenpeace code, but I did implement db_multirow -cache and db_list -cache and I think db_list_of_lists -cache ...

I really like the idea of caching right in the db API ...

Collapse
12: Re: caching ns_sets (response to 1)
Posted by Jun Yamog on
Hi Don,

Will db_list_of_ns_sets having caching too?  I was hoping it will.  Originally I tried to use db_list_of_lists since there are something that db_multirow is not good at.  But then due to some limitation of db_list_of_lists I ended up making a patched version.  I then decide to drop the patched up db_list_of_lists in favor or stock db_list_of_ns_sets.  So I am using db_list_of_ns_sets which fits the bill perfectly and I am assuming my caching issue will be solved one you get your caching db_*.  Will db_list_of_ns_sets have a caching feature in the future?