Main points covered were what sort of test data we should use and what the current bottlenecks are. Don says that permissions are the most critical issue (especially when there are a lot of groups and relsegs). He is investigating getting rid of a number of the UNION-based views which seem to create the most problems.
Also, Don is writing a data population script which he will provide when finished (this week probably), which will cover users, groups, relational segments, subsites, and file storage contents.
Jeff will write a script to generate statistics on current sites in order to establish what the data set we should be tuning for looks like for the relevant tables.
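A rough sketch of what such a statistics script might look like, written in Python against any DB-API connection. The table names (`users`, `acs_objects`, `groups`, `rel_segments`, `cr_items`) are assumptions about the schema and should be checked against the actual data model. The in-memory SQLite database is only a stand-in for a real Oracle or PostgreSQL site, and `collect_counts` and `demo_connection` are hypothetical helper names.

```python
import sqlite3

# Assumed table names; check them against the real data model before use.
TABLES = ["users", "acs_objects", "groups", "rel_segments", "cr_items"]


def collect_counts(conn, tables=TABLES):
    """Return {table: row count}; a table that cannot be queried maps to None."""
    counts = {}
    for table in tables:
        cur = conn.cursor()
        try:
            cur.execute("SELECT COUNT(*) FROM " + table)
            counts[table] = cur.fetchone()[0]
        except Exception:
            conn.rollback()  # clear the failed statement before the next query
            counts[table] = None
        finally:
            cur.close()
    return counts


def demo_connection():
    """Tiny in-memory SQLite database standing in for a production site."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE acs_objects (object_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO users VALUES (?)", [(i,) for i in range(3)])
    conn.executemany("INSERT INTO acs_objects VALUES (?)", [(i,) for i in range(10)])
    conn.commit()
    return conn


if __name__ == "__main__":
    for table, count in collect_counts(demo_connection()).items():
        print(table, "n/a" if count is None else count)
```

Reporting missing tables as "n/a" rather than aborting would let the same script run on sites with different package sets installed.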
There was discussion of timing and who might be interested in working on dotLRN testing. Janine suggested we speak to Bart Teeuwisse about possibly leading the technical side, and Jeff mentioned that getting Til Singer involved as well would be a good idea, since the work he has done on tclwebtest would be particularly useful for building automated tests.
Also, we discussed getting a testing machine set up. Janine has asked Al Essa and he says they should have a Solaris box available for testing (for Oracle mostly; we anticipate using the HUB servers for postgres testing).
We discussed the merits of using AOLServer 3.5.x vs. 4.0 vs. sticking with 3.3+ad13. The AOLServer team has decided not to release a 3.5.2 version incorporating the ad13 i18n patches, which means we will either need to stick with what we have or port the patches ourselves (they might release a 3.5.2 version with the patches if we did the work; Jeff will ask).
An alternative is to move to 4.0. As it turns out, the 4.0 beta should be released shortly, as per the AOLServer chat 2003-01-09:
Shmooved (7:32:55 PM): i would recommend to all waiting another week or so until we release a "blessed" 4.0 beta
GizmoBeastLives (7:31:19 PM): Practically though, you are talking about a period of several months before 4.0 final, right?
Shmooved (7:31:19 PM): there really aren't any good 4.0 docs yet
Shmooved (7:31:28 PM): hopefully not that long
Shmooved (7:31:41 PM): jim wants 4.0 installed here at aol in the next month or so
Since this seems so imminent, both Don and Jeff have said that they will build and test an install with 4.0. We need to ascertain what i18n support will be in the 4.0 release and whether we can simply move to 4.0 as the supported platform. There was a consensus that if AOL moves to it internally on the schedule they have mentioned, it will likely be production ready for a dotLRN 1.0 release, and certainly by the time i18n features would be mandatory for dotLRN.
We discussed Decision Making and Communication for the TAB group. We looked at the following as models for how we should run things internally and communicate our findings to others:
Also of interest is the W3C Technical Architecture Group, its charter, and the W3C WG process guidelines.
After some discussion, Don is working on synthesizing (cutting and pasting) those documents for our use.
There was a discussion of workflow design. Lars has asked for comment on the design, and Don is interested in how Roles will be defined and used, as well as whether or not relational segments will be used. The discussion will be carried on in the forums. Lars announced they were working on it and plan to post weekly updates on the work there as well. [ project page ]
Jan 14 17:18:55 <donb>
Anything else folks are dying to see populated other than users, groups, subgroups? Jeff mentioned site-nodes, I could make this generate tons of subsites and then groups for them etc
Jan 14 17:19:25 <jcdldn>
There is a comment in the code saying startup is a problem for Sloan since they have ~9000 nodes
Jan 14 17:21:59 <donb>
It's permissions that's the bottleneck, though of course one can write stupid queries that slow down with lots of objects (say, ones that count all replies to all threads rather than maintaining a denormalized count)
Jan 14 17:22:22 <donb>
(this is a huge problem with forums, one that I solved for SSV1 18 months ago)
Jan 14 17:22:20 <jcdldn>
I have not really seen problems with lots of objects, I did create about 1mm and things held together. Lots of users is an issue though since cc_users is slow even with the fix I found.
Jan 14 17:23:55 <donb>
Maybe Dan will get motivated to write some populate scripts for the CR, too, with the same motivation ...
Jan 14 17:24:26 <danw>
Well maybe.
Jan 14 17:24:43 <danw>
I've been thinking of a project related to cr, so it might tie in.
Jan 14 17:24:50 <donb>
OK so for dotLRN we really need to concentrate on testing. Scalability is an issue but I think we need to scale our expectations in this direction
Jan 14 17:25:45 <donb>
Heidelberg will be working with the OpenACS HEAD since they need internationalization and that's where my permissions hacking etc will go ... perms in 4.6 are a lost cause beyond relatively modest numbers (i.e. a few thousand, as opposed to Jeff's 50,000 users wish-figure)
Jan 14 18:16:39 <donb>
We could really use some tclwebtest scripts to do some intelligent load-testing of OpenACS/dotLRN - my populate scripts will help generate big DBs but then you want to pound on it
Jan 14 17:48:04 <danw>
Maybe someone could borrow one of MIT's servers to test that segfault?
Jan 14 17:45:01 <janine>
BTW, Mohan asked me this morning if we have any test plans for dotLRN. The answer is currently no, correct? No-one has heard of any planning being done?
Jan 14 17:18:51 <jcdldn>
(Does anyone have an opinion on the aolserver decision not to make 3.5.2 (with i18n)? can we discuss after testing)
Jan 14 17:53:33 <jcdldn>
It seems like unless we pull the ad13 patches into 3.5.x they won't go in.
Jan 14 17:53:55 <jcdldn>
I think they would release a 3.5.2 blessed version with the patches if the work was done by someone else.
Jan 14 17:54:54 <donb>
Can we verify they'd do this? I would think a lot of people would want it
Jan 14 17:55:05 <jcdldn>
Also, it might just work to move to 4.0 although I have not looked at it at all. Certainly the tcl version in 3.5 matches 4.0 so that should work. Also, the db drivers should work.
Jan 14 17:55:21 <donb>
4.0's not even in beta yet, though, right?
Jan 14 17:55:46 <jcdldn>
JimD said they were going to move to it internally in about a month.
Jan 14 17:56:50 <donb>
Well ... if that timeframe holds up I'd move for using 4.0
Jan 14 17:57:44 <donb>
I spoke to an acquaintance here over coffee this morning, an Intel engineer who dabbles a lot with AOLserver and some with OpenACS. He's been using the nsd library 4.0 provides that can be linked with Tcl directly (which probably obsoletes Cleverly's work in this direction) and says it's been working great
Jan 14 17:57:48 <donb>
so obviously they're close
Jan 14 17:57:56 <jcdldn>
I think we should be testing it pre beta since if we don't ask for changes before it's released we are not likely to get them later.
Jan 14 17:58:54 <donb>
I know Jamie Rasmussen wants to add windows support for 4.0, too (and has actually got it working with the early versions)
Jan 14 17:59:06 <jcdldn>
I also think they are not 100% sure what we need on our end for i18n (which is clearly true since I don't know and I expect Lars does not either).
Jan 14 17:59:49 <lars>
Correct, I have no idea.
Jan 14 18:00:31 <lars>
One of the items on the remaining list [of i18n work] would be AOLserver 3.5/4.0 support.
Jan 14 18:00:47 <lars>
Should we concern ourselves with both, or just one of them?
Jan 14 18:00:52 <jcdldn>
If they get package require working out of the box we would also have hugely more options with tcl libraries which would be a big win as well.
Jan 14 18:05:28 <lars>
I haven't looked at either of 3.5 or 4.0 at all. Do any of you happen to know what the i18n support looks like in 3.3+ad13 compared to 3.5 compared to 4.0?
Jan 14 18:06:25 <danw>
I thought I had heard that 4.0 would have some support.
Jan 14 18:06:41 <jcdldn>
Jamie (I think) had ported the ad13 i18n bits to 4.0 but not really tested it. Also, they have done some things internal to 4.0 as well.
Jan 14 18:11:08 <donb>
OK so at this point we're still in the dark and need to figure out how to enlighten ourselves as to what's in 4.0 and what's not
Jan 14 18:11:26 <donb>
In other words we need a volunteer ...
Jan 14 18:12:31 <jcdldn>
I'll build a 4.0 version and just see what happens.
Jan 14 18:12:53 <donb>
I'm planning to do so, too, but I don't really know anything about multi-byte and shift_JIS
Jan 14 18:14:13 <donb>
Let's see if 4.6 builds first, I at least have been wanting to try out 4.0 on general principle
Jan 14 18:18:28 <jcdldn>
That was last week so maybe there will be a beta this week. Here was the AOLSERVER chat log:
Shmooved (7:32:55 PM): i would recommend to all waiting another week or so until we release a "blessed" 4.0 beta
GizmoBeastLives (7:31:19 PM): Practically though, you are talking about a period of several months before 4.0 final, right?
Shmooved (7:31:19 PM): there really aren't any good 4.0 docs yet
Shmooved (7:31:28 PM): hopefully not that long
Shmooved (7:31:41 PM): jim wants 4.0 installed here at aol in the next month or so
Jan 14 18:19:14 <jcdldn>
This is all from last week's AOLServer chat 2003-01-09
Jan 14 18:20:08 <donb>
OK Lars and I both seem to like the Jakarta doc as at least a good starting place for words for describing how we might organize our decision making ... judging from today's mail
Jan 14 18:20:52 <donb>
Do you two have comments on our notes? On my thinking that the kind of broad-based core developer group that the TIP is organizationally prepared for makes sense for the main OpenACS project?
Jan 14 18:20:54 <lars>
yes
Jan 14 18:20:56 <jcdldn>
I agree as well. I mostly like the TIP and w3c tag stuff as an example of transparency.
Jan 14 18:21:35 <donb>
Right and OpenACS needs to be very transparent, while dotLRN TAB needs to figure out a way to blend transparency with responsibility towards the consortium
Jan 14 18:21:55 <danw>
I agree with that.
Jan 14 18:22:44 <donb>
TAB needs to communicate well but in essence it is a closed group...
Jan 14 18:23:10 <jcdldn>
Although I think donb should get Ousterhout like powers to resolve deadlocks.
Jan 14 18:23:22 <donb>
OK I think I have a good sense of what makes sense to summarize from the various docs Jeff's referred us to, then. Having this pre-definition narrows the scope of doing that summary ...
Jan 14 18:24:19 <donb>
I wonder if John's ever had to exercise that deadlock breaking power?
Jan 14 18:24:49 <jcdldn>
Given the scope I would assume so (and TIP's been around a while).
Jan 14 18:25:14 <jcdldn>
Should we start making the chat logs available? Or some sort of summary?
Jan 14 18:25:31 <lars>
I think we should
Jan 14 18:25:44 <lars>
Not just for other people's sake, but also for our own
Jan 14 18:25:52 <donb>
A summary I think ... like board meeting notes ... but I don't want to volunteer to keep them :(
Jan 14 18:25:54 <lars>
I'd vote for a summary, since reading chat logs is painful
Jan 14 18:26:04 <donb>
Yes, that's my only reason for supporting a summary
Jan 14 18:26:21 <donb>
"I lost my internet connection" "I had a power blink" ... people don't really need to read that every week
Jan 14 18:26:29 <danw>
We might also feel restricted in our discussions if we post the logs.
Jan 14 18:26:29 <jcdldn>
I will summarize. I also have the older logs so can shake out anything useful from them.
Jan 14 18:26:45 <lars>
excellent, thanks
Jan 14 18:27:47 <donb>
We can then post the summaries ... maybe in the dotLRN forum as well as in our file storage space at sloan?
Jan 14 18:28:13 <jcdldn>
Or under the dotlrn project at openacs.org?
Jan 14 18:29:21 <donb>
Yeah maybe the dotlrn project at openacs.org ... it would be nice to skin it with the dotlrn site look just as a demo, too (it's a subsite or at least that was my request)
Jan 14 18:31:13 <donb>
Let's talk about workflow and roles at least in the 4.7 context for a moment, could we do that?
Jan 14 18:31:23 <lars>
and on making it easy to use, of course.
Jan 14 18:33:11 <donb>
Lars ... in the new workflow package are you building in workflow-specific roles as was done in the old one? My thinking is that relational segments are a natural for that and as my intent is to provide some intelligent admin UI for user and group/subgroup management in 4.7 I've been hoping that we don't end up with yet another role-definition mechanism
Jan 14 18:34:25 <lars>
The short answer is: Yes.
Jan 14 18:34:38 <lars>
Take a look at the spec, in particular roles and assignments.
Jan 14 18:34:52 <donb>
I've looked at the spec but to be honest didn't look close enough, will do so again
Jan 14 18:35:09 <lars>
I think "roles" (or some other term, doesn't have to be roles) are necessary as a separate workflow concept.
Jan 14 18:35:21 <lars>
In bug-tracker, the roles would be "submitter", and "assignee".
Jan 14 18:35:51 <lars>
But who gets to "play" those roles for a particular bug will be different for each bug.
Jan 14 18:36:14 <lars>
Whereas who are *allowed* to play certain roles would likely map to an OpenACS group or relational segment.
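The distinction Lars draws here (a role is defined once per workflow, who plays it is decided per case, and who is allowed to play it comes from a group or segment) could be sketched roughly as follows in Python. The `Role` and `Case` classes and their fields are purely illustrative assumptions, not the workflow package's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A workflow-level role such as "submitter" or "assignee"."""
    name: str
    allowed_parties: set  # ids of users allowed to play it, e.g. a group's members


@dataclass
class Case:
    """One workflow case, e.g. a single bug in bug-tracker."""
    case_id: int
    assignments: dict = field(default_factory=dict)  # role name -> set of user ids

    def assign(self, role, user_id):
        """Let user_id play `role` for this case only, if the role allows it."""
        if user_id not in role.allowed_parties:
            raise PermissionError(f"user {user_id} may not play role {role.name}")
        self.assignments.setdefault(role.name, set()).add(user_id)


if __name__ == "__main__":
    assignee = Role("assignee", allowed_parties={1, 2})
    bug = Case(case_id=42)
    bug.assign(assignee, 1)
    print(bug.assignments)
```

The same `Role` object is shared by every case, while each `Case` keeps its own assignments, which is the split between "allowed to play" and "actually plays" described above.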
Jan 14 18:36:41 <donb>
I see ...
Jan 14 18:36:56 <lars>
On the other hand, I'm not too well-versed in rel segments and all that stuff, so there might be some way of utilizing the kernel features for this that I don't know of.
Jan 14 18:37:36 <donb>
relational segments need a group on one side and a party on the other side ...
Jan 14 18:38:48 <donb>
On the other hand you could define an acs-relation that defines a bug object on one side and a party on the other to implement your mapping ... whether or not it's worth it is an open question because acs-relations aren't as lightweight as they could be (but they were meant to play this kind of object-object mapping role and allow the defining of constraints to guard against coding errors etc)
Jan 14 18:38:49 <lars>
I don't think they'd make any sense here, but I don't know.
Jan 14 18:39:04 <lars>
Yes, acs_rels, right?
Jan 14 18:39:11 <donb>
Yes acs_rels ...
Jan 14 18:39:39 <lars>
Yeah, I asked about using acs_rels for assigning users and stuff to bugs back when I built bug-tracker, and people seemed to think it was overkill.
Jan 14 18:40:36 <donb>
Well ... I agree that they're overkill but only because the implementation is overdone IMO. On the other hand there's existing Tcl API in acs-subsites that makes using them extremely easy.
Jan 14 18:41:12 <lars>
So what's the verdict? I'd really appreciate your input on this (all 3 of you), if you had the time to look over the spec. Or even the data model and code from CVS (but email me before you do, so we can make sure we've committed the latest version of everything)
Jan 14 18:41:20 <donb>
The "populate" package I mention makes reltypes for user mapping so that can give us some sense of scalability ...
Jan 14 18:42:17 <donb>
I need to think about acs_rels usage. If the impact on scalability isn't significant I'd like to see us use them because we could use a good, cleanly-written example in a package that's not too difficult to understand and that's actually completed.
Jan 14 18:42:28 <jcdldn>
Side note: On the service-contract side Til said he would volunteer to migrate old sc implementations to the new api
Jan 14 18:42:50 <lars>
Jcd: Nice.
Jan 14 18:43:00 <donb>
One thing I'll say is that acs_rels has *not* been a factor in dotLRN scalability thus far ... it's purely permissions stuff that kills us. And yes, Jeff, nice!
Jan 14 18:43:25 <donb>
Anyway ... can I get back to you in a few days after I explore acs_rels a bit more?
Jan 14 18:43:32 <lars>
Absolutely.
Jan 14 18:43:36 <danw>
You've tested with a large number of rel segs?
Jan 14 18:43:50 <donb>
SloanSpace V2 has about 2,000 of them
Jan 14 18:44:00 <lars>
We'll keep not doing acs_rels, I don't think changing it will be that much of a headache, at least not until we have to worry about an upgrade path, which is still at least 3 weeks away.
Jan 14 18:44:02 <donb>
that's not huge
Jan 14 18:44:09 <danw>
On oracle - right?
Jan 14 18:44:32 <jcdldn>
Do you know if sloan is going to consider moving to 4.7?
Jan 14 18:44:59 <donb>
Yes, Dan - though I don't foresee PG being much different. It's just the big union views that kill ...
Jan 14 18:45:21 <danw>
Yes, I was thinking about some of those views in the rel segs data model.
Jan 14 18:46:25 <donb>
My Grand Notion for solving the perms problem and all related issues is to denormalize the party-member-map view ... well, in essence to make it a materialized view (by hand, in PG at least, I need to explore materialized views in Oracle)
Jan 14 18:46:32 <donb>
Good-bye union views
Jan 14 18:46:39 <donb>
It's those unions that are killing us
Jan 14 18:47:02 <donb>
Also ... OK, let me explain the results from a little archeology I undertook by accident last week ...
Jan 14 18:47:50 <donb>
There are ancient upgrade scripts lying around from 4.1.1->4.2. They build the membership groups and relsegs for acs-subsites. They migrate all registered users to these groups. The notion was to get rid of the "-2" registered users hack (registered users are just members of "/" subsite)
Jan 14 18:49:05 <donb>
All that migration code was written. What *wasn't* completed was the changes to acs-subsite's registration code to register people into the "/" membership groups rather than into registered users, nor was package code rewritten to correctly check for membership rather than registered users (packages today don't honor subsite memberships when checking to see who can read something, there are lots of registered user checks)
Jan 14 18:49:32 <donb>
So aD got about 50% done with this before 4.2 was released.
Jan 14 18:50:34 <donb>
Anyway the foundation's there for rationalizing groups in the way I've suggested off-and-on in the past. No more kludge group for registered users. Work off rel segments for perm checking not groups then rel segs (a membership relseg is built for each subsite group and this should be generalized).
Jan 14 18:51:05 <donb>
There's really little code to write here - it really looks like one day it was being worked on, then never changed afterwards
Jan 14 18:51:30 <donb>
Anyway three of the UNION clauses in those nasty views can go away when groups are rationalized
Jan 14 18:52:13 <jcdldn>
the 4.2 sites did not really need it (at least the ones I worked on) since they had very simple permissioning requirements other than for cms.
Jan 14 18:52:14 <donb>
If party-member-map were denormalized/materialized/whatever we'd need no UNION clauses at all and perm checking and various membership views would just consist of joining against a mapping table indexed on both sides
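A toy sketch of the idea in Python with SQLite, assuming a drastically simplified schema: the tables `group_member`, `permission`, and `party_member_map`, and the functions `rebuild_party_member_map` and `has_permission`, are all hypothetical stand-ins rather than the real OpenACS data model. The flattened map is maintained by hand (here, rebuilt wholesale), indexed on both sides, and a permission check becomes a plain join with no UNION.

```python
import sqlite3

SCHEMA = """
CREATE TABLE group_member (group_id INTEGER, member_id INTEGER);
CREATE TABLE permission (object_id INTEGER, grantee_id INTEGER, privilege TEXT);
CREATE TABLE party_member_map (party_id INTEGER, member_id INTEGER,
                               PRIMARY KEY (party_id, member_id));
CREATE INDEX party_member_map_member_idx ON party_member_map (member_id, party_id);
"""


def rebuild_party_member_map(conn):
    """Recompute the flattened map: one row per (party, direct or indirect member)."""
    direct = {}
    parties = set()
    for group_id, member_id in conn.execute("SELECT group_id, member_id FROM group_member"):
        direct.setdefault(group_id, set()).add(member_id)
        parties.update((group_id, member_id))
    for (grantee_id,) in conn.execute("SELECT grantee_id FROM permission"):
        parties.add(grantee_id)
    rows = set()
    for party in parties:
        rows.add((party, party))  # every party counts as a member of itself
        stack, seen = [party], {party}
        while stack:
            for member in direct.get(stack.pop(), ()):
                if member not in seen:
                    seen.add(member)
                    stack.append(member)
                    rows.add((party, member))
    conn.execute("DELETE FROM party_member_map")
    conn.executemany("INSERT INTO party_member_map VALUES (?, ?)", sorted(rows))


def has_permission(conn, object_id, user_id, privilege):
    """Check a privilege by joining grants against the flattened map."""
    row = conn.execute(
        """SELECT 1 FROM permission p
           JOIN party_member_map m ON m.party_id = p.grantee_id
           WHERE p.object_id = ? AND p.privilege = ? AND m.member_id = ?""",
        (object_id, privilege, user_id)).fetchone()
    return row is not None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # group 1 contains subgroup 2, which contains user 10; user 11 is in no group
    conn.executemany("INSERT INTO group_member VALUES (?, ?)", [(1, 2), (2, 10)])
    conn.execute("INSERT INTO permission VALUES (100, 1, 'read')")
    rebuild_party_member_map(conn)
    return conn
```

A real implementation would presumably keep the map current incrementally (for example with triggers) instead of rebuilding it; the full rebuild is used here only to keep the sketch short.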
Jan 14 18:53:01 <donb>
Most sites won't need it but ... things will get simpler and should get faster
Jan 14 18:53:16 <lars>
Sounds good. Put it in the road map. :)
Jan 14 18:53:42 <donb>
Anyway ... there's already a mapping table for groups and parties so getting rid of that and making a giant party-party map wouldn't actually add that many rows to a site with lots of groups/subgroups.
Jan 14 18:53:45 <donb>
i.e. dotLRN
Jan 14 18:53:54 <donb>
perms scale horribly at the moment when there are lots of groups
Jan 14 18:54:07 <donb>
(sometimes even when there aren't but most certainly when there are)
Jan 14 18:54:18 <donb>
Yeah ... let's put together a road map!
Jan 14 18:54:56 <donb>
I'm a lot more optimistic about making perms scale than I was a couple of months ago ...
Jan 14 18:55:12 <jcdldn>
Don, do you think we will avoid needing tcl perm caching?
Jan 14 18:55:22 <donb>
Don't know ...
Jan 14 18:55:29 <donb>
For simple checks, yes
Jan 14 18:55:46 <donb>
And that's mostly what SSV2 caches
Jan 14 18:55:49 <lars>
So is this something that you'll be able to get done for the Sloan funding you have? for 4.7?
Jan 14 18:56:17 <donb>
I think so, Lars. My charter's very simple: "make dotLRN better" and when Al and I first talked I said "making perms scale is #1 on my list"
Jan 14 18:56:19 <jcdldn>
Also, the killer (even without groups) is the "all X that I have Y permission on" query...is that what you are looking at when trying to figure out what to denormalize?
Jan 14 18:57:13 <donb>
I *think* that once that view's not based on subselects/subviews with the UNION clauses that both PG and Oracle will be able to optimize the queries to join using index scans.
Jan 14 18:57:32 <donb>
The problem today is that the whole massive UNION select is built then nested loops galore run against it
Jan 14 18:57:59 <donb>
But if that weren't sufficient alone I'd be open to exposing more of the perm/party datamodel in queries ...
Jan 14 18:58:30 <donb>
abstraction's great and I'm all in favor but if abstraction kills performance and scalability and we can't figure out how to improve it, abstraction's gotta give way
Jan 14 18:58:35 <donb>
IMO of course
Jan 14 18:58:59 <jcdldn>
I would agree.
Jan 14 18:59:01 <lars>
Yes, of course. It's gotta work.
Jan 14 18:59:11 <donb>
What a refreshing attitude :)
Jan 14 18:59:34 <donb>
But breaking the abstraction may not be necessary once the UNIONs go away
Jan 14 19:00:14 <donb>
UNIONs in subselects are evil ... the aD folks didn't know that (and I can't blame them, it's only large amounts of data that exposed the problem and they wouldn't have known when they designed it, they fell into a trap)
Jan 14 19:00:46 <donb>
OK enough for now. Anyone else have anything? I have a luncheon appointment at twelve and some stuff to do first ...
Jan 14 19:00:49 <jcdldn>
Should we try to collect some statistics from current users in terms of # of objects, users, groups, etc? I could write a little script to run to generate the #s and see if we could get some production sites to run it. Not sure if that would have value though.
Jan 14 19:01:33 <lars>
I think your approach with setting some goals and testing against those might actually have more value
Jan 14 19:01:36 <donb>
I think getting some numbers is an excellent idea. Though maybe Greenpeace (gobs and gobs of blob'd multimedia CR content) and SSV2 are sufficient
Jan 14 19:02:12 <donb>
Actually Lars, you're probably right. But SSV2 and GP give us some idea as to what a typical large-scale site may be carrying in terms of data
Jan 14 19:02:28 <donb>
But we want to see if we can scale to much larger numbers ...
Jan 14 19:02:51 <donb>
Being in a position to seriously tackle scalability is exciting regardless ...
Jan 14 19:04:00 <donb>
Jeff ... if you want to write a simple script to count users, objects, groups, relsegs, and cr_items I'm sure I can run them at least against GP and SSV2 ... it would be interesting
Jan 14 19:04:02 <jcdldn>
well, I have to run now too.
Jan 14 19:04:13 <jcdldn>
I will send something off to you.
Jan 14 19:04:22 <lars>
Okay, I'd appreciate it if you have the time to check out workflow design and the threads I've started about service contracts, auto-mount, etc.
Jan 14 19:04:23 <donb>
OK ...
Jan 14 19:04:37 <lars>
Other than that, I don't have anything else today, either.
Jan 14 19:04:39 <donb>
Haven't seen the service contract bit ...
Jan 14 19:04:57 <donb>
auto-mount I need to check back ... we really need to auto-mount properly whether or not you personally have time to fix it
Jan 14 19:05:07 <donb>
not honoring post-instantiation code sucks