Forum OpenACS Development: Compatibility Matrix

Collapse
Posted by Torben Brosten on
Hi all,

Has there been any progress in using versions of OpenACS components beyond the ones in the Compatibility Matrix ( https://openacs.org/xowiki/openacs-compatibility-matrix )?

Ubuntu 11.10 installs Postgresql 9.2 by default.

Are any known issues working with pg 9 on OpenACS (besides the pg8 config changes)?

Collapse
2: Re: Compatibility Matrix (response to 1)
Posted by Torben Brosten on
Looks like we just need to add an additional parameter to postgresql.conf:

http://wiki.postgresql.org/wiki/What%27s_new_in_PostgreSQL_9.0#PL.2FpgSQL_changes_which_may_cause_regressions

Any other known issues?

Collapse
3: Re: Compatibility Matrix (response to 1)
Posted by Torben Brosten on
Correction, ubuntu 11.10 installs pg 9.1 currently.
Collapse
4: Re: Compatibility Matrix (response to 1)
Posted by Torben Brosten on
For tsearch2:

http://www.postgresql.org/docs/9.0/static/tsearch2.html

These upgrade directions are ambiguous for anyone, such as myself, who hasn't worked with tsearch2 internals etc.

Looks like OpenACS is going to need some kind of recipe/script to convert to pg9. :/

Collapse
5: Re: Compatibility Matrix (response to 1)
Posted by Torben Brosten on
Barebones recipe:

http://sql-info.de/postgresql/notes/converting-tsearch2-to-8.3.html

Not sure how this affects OpenACS search (or other) package code..

Collapse
6: Re: Compatibility Matrix (response to 5)
Posted by Ryan Gallimore on
As of PG 8.3, tsearch2 is included. All you have to do is install the tsearch2-driver package.
Collapse
7: Re: Compatibility Matrix (response to 6)
Posted by Torben Brosten on
Hi Ryan,

I see that it's included, but the transition doesn't appear trivial. Does the tsearch2-search package automatically adjust to the changes?

Collapse
8: Re: Compatibility Matrix (response to 1)
Posted by Eduardo Santos on
Hi torben,

As this subject came on topic, I had REALLY SERIOUS issues trying to upgrade to PostgreSQL 8.4. It seems like the optimizer changes made some of my system queries so slow it was just impossible to use it.

Take a look at these explain analyze:

PostgreSQL 8.4: http://explain.depesz.com/s/oHZ
PostgreSQL 8.3: http://explain.depesz.com/s/PGp

The index stats are the bigger concern.

PostgreSQL 8.4:


Table name Scan count Total time % of query
scan type count sum of times % of table
acs_object_context_index 1 43104.852 ms 45.0 %
Index Scan 1 43104.852 ms 100.0 %
acs_objects 1 1.072 ms 0.0 %
Index Scan 1 1.072 ms 100.0 %
acs_permissions 1 17.085 ms 0.0 %
Index Scan 1 17.085 ms 100.0 %
acs_privilege_descendant_map 1 1.675 ms 0.0 %
Seq Scan 1 1.675 ms 100.0 %
forums_forums 1 0.899 ms 0.0 %
Seq Scan 1 0.899 ms 100.0 %
party_approved_member_map 1 1.206 ms 0.0 %
Index Scan 1 1.206 ms 100.0 %
site_nodes 1 17.956 ms 0.0 %
Index Scan

PotgreSQL 8.3:


Table name Scan count Total time % of query
scan type count sum of times % of table
acs_objects 1 1085.869 ms 27.9 %
Index Scan 1 1085.869 ms 100.0 %
acs_permissions 1 450.870 ms 11.6 %
Index Scan 1 450.870 ms 100.0 %
forums_forums 1 56.911 ms 1.5 %
Bitmap Heap Scan 1 56.911 ms 100.0 %
party_approved_member_map 1 494.312 ms 12.7 %
Index Scan 1 494.312 ms 100.0 %
site_nodes 1 38.994 ms 1.0 %
Index Scan 1 38.994 ms 100.0 %

The index acs_object_context_index is not an issue in PostgreSQL 8.3 and became a big issue in PostgreSQL 8.4. Maybe somebody with better knowledge about PostgreSQL can help to clarify this issue.

Collapse
9: Re: Compatibility Matrix (response to 8)
Posted by Gustaf Neumann on
Eduardo,

You are right about the changes in PostgreSQL, which became worse for largish OpenACS/DotLRN sites after 8.3; for smallish sites, 8.4+ is ok. Therefore that's not a question about compatibility, but about recommended versions depending on the site's needs.

Starting from PostgreSQL 8.4, the postgres optimizer prefers in more cases sequential scans (at least with pretty standard pg-config files). This is especially bad for permissions (and search) in large sites with dotlrn. For this reason, we kept our production site a long time in PostgreSQL 8.3.*, and we hoped that the postgres optimizer in newer versions would behave better for these cases.

During this summer, we lost our patience and changed in our main installations the permission management to recursive queries [1] and switched to PostgreSQL 9.0.*. The change involved as well changes in the basic data-model (the huge materialized transitive tables). The work was done mostly by Victor Guerra. As a result, permission queries are now for our site significantly faster (often 4 times) than with 8.3. I think, there is still some need for cleanup in our installation (Victor is currently on vacation), but we will commit our changes back to CVS when ready.

-gustaf neumann

[1] http://www.postgresql.org/docs/9.0/static/queries-with.html

Collapse
10: Update on PG 9.0 changes? (response to 9)
Posted by Dave Bauer on
Hi Gustaf,

Is there any timeframe for this code to be available?

Thanks!
Dave

Collapse
Posted by Victor Guerra on
Hi Dave,

As Gustaf mentioned, we made some modifications in our installation in order to have a better performance when moving from PG 8.3 to PG 9.0. The changes mostly affect permissions.

Basically one has to run a couple of scripts and change some xql files. So the code is not in reality much. Everything is packaged in its own package so applying the changes should not be hard for a stand alone openacs installation; the only thing is that steps for upgrading need to be well stablished so people dont get lost. So, Ill try to dedicate a couple of hours this and next week and do some clean up and docu ( and some testing ;) ).

Best,

Collapse
Posted by Eduardo Santos on
Hi Victor,

Is there anything I can do to help you? This is something I really need.

Collapse
24: Re: Compatibility Matrix (response to 9)
Posted by Dave Bauer on
Hi,

I have applied Victors changes and it works fine except when trying to update the context hierarchy or set permissions inheritance since it then fires the triggers to maintain the tables.

The code I got from Victor did not include any data model changes or changes to the triggers so I am not sure where to go from here.

I know Victor is on vacation but if someone from Vienna has some information, or anyone else, and I am leaving this message as a record of my progress.

Thanks
Dave

Collapse
25: Re: Compatibility Matrix (response to 24)
Posted by Gustaf Neumann on
Hi Dave,

not sure, about what exact set of changes victor has sent to you (we did most of the work in may 2010); however, the huge tree of the context hierarchy is not needed any more with the recursive queries (from the main openacs/dotlrn applications; one never knows about custom modifications) and was just kept to switch easier back to the old approach in case something was wrong. As far as i know, victor has sent the changes to a few expert users months ago to get some feedback about potential problems.

-gustaf

Collapse
26: Re: Compatibility Matrix (response to 25)
Posted by Dave Bauer on
Here is where I am experiencing some issues

2.5 seconds:
update acs_objects
set security_inherit_p = 'f'
where object_id = '1290509'

1.6 seconds
select acs_permission__grant_permission(
'182879',
'27310',
'read'
);

So it appears that these are probably updating acs_object_party_privilege_map and related tables via triggers with this which will get really slow if you need to update more than 1 object. Its already pretty slow, several seconds to change permissions on 1 object.

Collapse
27: Re: Compatibility Matrix (response to 26)
Posted by Dave Bauer on
I checked and I did update shared buffers to 128mb in my PG 9.1 config.
Collapse
Posted by Dave Bauer on
Very interesting results.

If you try to change the access policy on a bugtracker instance it calls set_inherit and grant_permission on every bug.

This can be hundreds of objects. The first ones took 1-2 seconds per query as the page runs by the end the last objects take 1 ms.

It appears Postgresql is loading the tables/indexes into memory and as that happens to seems to help.

Maybe this is not a signifigant issue unless you try to issue hundreds of small queries in one submission.

Collapse
29: Re: Compatibility Matrix (response to 25)
Posted by Victor Guerra on
Dave,

as Gustaf already mentioned in a previous post, the changes we made sometime ago were done in a way that switching back to the original implementation of permissions was done as easy as possible.

Therefore the code that I provided to you ( which can be found here: https://github.com/vguerra/permissions ) does not alter the current data model. By using recursive queries some tables and triggers are not necessary anymore but this is something we still have to do locally.

As soon as we make progress on that I'll provide the changes.

Best,

Collapse
15: Re: Compatibility Matrix (response to 8)
Posted by Don Baccus on
Try reducing random_page_cost to something lower than the default value 4, and raising the value of effective_cache_size to the size of your shared buffers + something reasonably large for the OS filesystem cache (and remember, modern disk drives cache too!).

We were having problems with unwanted sequential scans where indexed scans were obviously 50x faster in a production system I'm working on (NOT OpenACS/dotLRN) and by fiddling with these parameters I was able to persuade the planner to use indexed scans where it seemed to me it should.

I set random_page_cost to 1.0 ...

Collapse
16: Re: Compatibility Matrix (response to 15)
Posted by Don Baccus on
This was PG 9.1 BTW.

Let me also add that I tried recasting one of the offending queries several ways (in (SELECT ... IN (SELECT..., joins, join on subquery in FROM clause, etc) and got the same plan each time (except the nested IN SELECT forms lead to semi joins rather than full joins). That's good in that the planner's able to build a canonical form from each of my twisted efforts to fool it that's equivalent to the deeper meaning of what I was trying to do. It's bad in that each plan had a sequential scan of (gulp) 69 million rows!!!!

Collapse
13: Re: Compatibility Matrix (response to 1)
Posted by Dave Bauer on
I can also help with testing so please let me know how I can help.
Collapse
14: Re: Compatibility Matrix (response to 13)
Posted by Victor Guerra on
Thank you Dave and Eduardo, indeed help on testing would be very much appreciated =). I'll get back to you soon.
Collapse
17: Re: Compatibility Matrix (response to 1)
Posted by Torben Brosten on
One hopes that sequential scans include an interpolation process on sorted keys to help reduce the scan range.
Collapse
18: Re: Compatibility Matrix (response to 17)
Posted by Don Baccus on
As someone who understands db implmentations quite well, I gotta say this makes no sense.

What you describe can be used to filter ouput to the client of the scan, but no, table data's not sorted so seq scans do scan the whole table.

Absent partitioning, I guess, which is useful for data warehousing, not dynamic websites.

Collapse
19: Re: Compatibility Matrix (response to 18)
Posted by Torben Brosten on
Hi Don,

Good to know. Thank you.

It seems to me that smart (whatever it's called) interpolation of an ordered set could be faster than sequencing once the count reaches some massive number.. By "smart" I'm referring to a modified conventional mathematical interpolation, where the next best guess uses the last two data points to suggest an intersecting line and intersection point to jump to --providing the next best guess.

Assuming 69 million rows, ordered, non-linear values, one could probably cut to search range with a very small count within just a few steps.. Would be fun to play with.

Anyway, I've successfully used a technique like this to reduce the interpolation loops of some business math functions to something reasonable for dynamic web pages.. About 12 iterations on projects with millions of values, non-linear. I'd be glad to share, if you know someone that want to pursue this.

cheers,

Collapse
21: Re: Compatibility Matrix (response to 19)
Posted by Don Baccus on
"Assuming 69 million rows, ordered"

But there's no order imposed on the rows in the db. Unless you insert them in a certain order and never delete or update even one single row in the table.

I suppose there are tables like that but it's certainly not typical of how a database is used.

Collapse
22: Re: Compatibility Matrix (response to 21)
Posted by Torben Brosten on
Hi Don,

Yeah, hair brain idea. I get it. I just passed the idea on to someone that could do something about it if there was anything to it; They didn't see it applicable to DB scans either.

Collapse
20: Re: Compatibility Matrix (response to 1)
Posted by Patrick Giagnocavo on
Does the postgresql.conf setting, default_statistics_target, have any effect on such large table scans? I remember for one client, changing from the default value to "500" seemed to make things better (you have to do an ANALYZE after changing the value).
Collapse
23: Re: Compatibility Matrix (response to 1)
Posted by Gustaf Neumann on
Before we decided in our installation to head toward recursive queries, we were fiddling as well with the page costs. We could persuade in some queries pg to use the index, in some queries we had to turn off the optimizer, and we had to rearrange queries, etc. Improving the results from some queries often worsen the results on others, some queries were still not effected, the results were not really encouraging; furthermore, for packaging openacs in distros, recommended configuration changes make life more complicated.

From our experience getting rid of the huge materialized tree tables (having many duplicate keys), and move towards recursive queries looks much more promising. As effect, e.g. permission queries are now several times faster with 9.0 then they were with 8.3 (we are using the recursive queries in production since summer). However, this path requires some data model changes. Victor has the packaging of the changes in a more or less non-destructive way mostly done, just a matter of days. The package will be made available to the brave (i.e. early adapters). Depending on the results we can decide via TIP, whether that should go into the main branches.