Forum OpenACS Development: Re: Compatibility Matrix

Collapse
8: Re: Compatibility Matrix (response to 1)
Posted by Eduardo Santos on
Hi torben,

As this subject came on topic, I had REALLY SERIOUS issues trying to upgrade to PostgreSQL 8.4. It seems like the optimizer changes made some of my system queries so slow it was just impossible to use it.

Take a look at these explain analyze:

PostgreSQL 8.4: http://explain.depesz.com/s/oHZ
PostgreSQL 8.3: http://explain.depesz.com/s/PGp

The index stats are the bigger concern.

PostgreSQL 8.4:


Table name Scan count Total time % of query
scan type count sum of times % of table
acs_object_context_index 1 43104.852 ms 45.0 %
Index Scan 1 43104.852 ms 100.0 %
acs_objects 1 1.072 ms 0.0 %
Index Scan 1 1.072 ms 100.0 %
acs_permissions 1 17.085 ms 0.0 %
Index Scan 1 17.085 ms 100.0 %
acs_privilege_descendant_map 1 1.675 ms 0.0 %
Seq Scan 1 1.675 ms 100.0 %
forums_forums 1 0.899 ms 0.0 %
Seq Scan 1 0.899 ms 100.0 %
party_approved_member_map 1 1.206 ms 0.0 %
Index Scan 1 1.206 ms 100.0 %
site_nodes 1 17.956 ms 0.0 %
Index Scan

PotgreSQL 8.3:


Table name Scan count Total time % of query
scan type count sum of times % of table
acs_objects 1 1085.869 ms 27.9 %
Index Scan 1 1085.869 ms 100.0 %
acs_permissions 1 450.870 ms 11.6 %
Index Scan 1 450.870 ms 100.0 %
forums_forums 1 56.911 ms 1.5 %
Bitmap Heap Scan 1 56.911 ms 100.0 %
party_approved_member_map 1 494.312 ms 12.7 %
Index Scan 1 494.312 ms 100.0 %
site_nodes 1 38.994 ms 1.0 %
Index Scan 1 38.994 ms 100.0 %

The index acs_object_context_index is not an issue in PostgreSQL 8.3 and became a big issue in PostgreSQL 8.4. Maybe somebody with better knowledge about PostgreSQL can help to clarify this issue.

Collapse
9: Re: Compatibility Matrix (response to 8)
Posted by Gustaf Neumann on
Eduardo,

You are right about the changes in PostgreSQL, which became worse for largish OpenACS/DotLRN sites after 8.3; for smallish sites, 8.4+ is ok. Therefore that's not a question about compatibility, but about recommended versions depending on the site's needs.

Starting from PostgreSQL 8.4, the postgres optimizer prefers in more cases sequential scans (at least with pretty standard pg-config files). This is especially bad for permissions (and search) in large sites with dotlrn. For this reason, we kept our production site a long time in PostgreSQL 8.3.*, and we hoped that the postgres optimizer in newer versions would behave better for these cases.

During this summer, we lost our patience and changed in our main installations the permission management to recursive queries [1] and switched to PostgreSQL 9.0.*. The change involved as well changes in the basic data-model (the huge materialized transitive tables). The work was done mostly by Victor Guerra. As a result, permission queries are now for our site significantly faster (often 4 times) than with 8.3. I think, there is still some need for cleanup in our installation (Victor is currently on vacation), but we will commit our changes back to CVS when ready.

-gustaf neumann

[1] http://www.postgresql.org/docs/9.0/static/queries-with.html

Collapse
10: Update on PG 9.0 changes? (response to 9)
Posted by Dave Bauer on
Hi Gustaf,

Is there any timeframe for this code to be available?

Thanks!
Dave

Collapse
Posted by Victor Guerra on
Hi Dave,

As Gustaf mentioned, we made some modifications in our installation in order to have a better performance when moving from PG 8.3 to PG 9.0. The changes mostly affect permissions.

Basically one has to run a couple of scripts and change some xql files. So the code is not in reality much. Everything is packaged in its own package so applying the changes should not be hard for a stand alone openacs installation; the only thing is that steps for upgrading need to be well stablished so people dont get lost. So, Ill try to dedicate a couple of hours this and next week and do some clean up and docu ( and some testing ;) ).

Best,

Collapse
Posted by Eduardo Santos on
Hi Victor,

Is there anything I can do to help you? This is something I really need.

Collapse
24: Re: Compatibility Matrix (response to 9)
Posted by Dave Bauer on
Hi,

I have applied Victors changes and it works fine except when trying to update the context hierarchy or set permissions inheritance since it then fires the triggers to maintain the tables.

The code I got from Victor did not include any data model changes or changes to the triggers so I am not sure where to go from here.

I know Victor is on vacation but if someone from Vienna has some information, or anyone else, and I am leaving this message as a record of my progress.

Thanks
Dave

Collapse
25: Re: Compatibility Matrix (response to 24)
Posted by Gustaf Neumann on
Hi Dave,

not sure, about what exact set of changes victor has sent to you (we did most of the work in may 2010); however, the huge tree of the context hierarchy is not needed any more with the recursive queries (from the main openacs/dotlrn applications; one never knows about custom modifications) and was just kept to switch easier back to the old approach in case something was wrong. As far as i know, victor has sent the changes to a few expert users months ago to get some feedback about potential problems.

-gustaf

Collapse
26: Re: Compatibility Matrix (response to 25)
Posted by Dave Bauer on
Here is where I am experiencing some issues

2.5 seconds:
update acs_objects
set security_inherit_p = 'f'
where object_id = '1290509'

1.6 seconds
select acs_permission__grant_permission(
'182879',
'27310',
'read'
);

So it appears that these are probably updating acs_object_party_privilege_map and related tables via triggers with this which will get really slow if you need to update more than 1 object. Its already pretty slow, several seconds to change permissions on 1 object.

Collapse
27: Re: Compatibility Matrix (response to 26)
Posted by Dave Bauer on
I checked and I did update shared buffers to 128mb in my PG 9.1 config.
Collapse
Posted by Dave Bauer on
Very interesting results.

If you try to change the access policy on a bugtracker instance it calls set_inherit and grant_permission on every bug.

This can be hundreds of objects. The first ones took 1-2 seconds per query as the page runs by the end the last objects take 1 ms.

It appears Postgresql is loading the tables/indexes into memory and as that happens to seems to help.

Maybe this is not a signifigant issue unless you try to issue hundreds of small queries in one submission.

Collapse
29: Re: Compatibility Matrix (response to 25)
Posted by Victor Guerra on
Dave,

as Gustaf already mentioned in a previous post, the changes we made sometime ago were done in a way that switching back to the original implementation of permissions was done as easy as possible.

Therefore the code that I provided to you ( which can be found here: https://github.com/vguerra/permissions ) does not alter the current data model. By using recursive queries some tables and triggers are not necessary anymore but this is something we still have to do locally.

As soon as we make progress on that I'll provide the changes.

Best,

Collapse
15: Re: Compatibility Matrix (response to 8)
Posted by Don Baccus on
Try reducing random_page_cost to something lower than the default value 4, and raising the value of effective_cache_size to the size of your shared buffers + something reasonably large for the OS filesystem cache (and remember, modern disk drives cache too!).

We were having problems with unwanted sequential scans where indexed scans were obviously 50x faster in a production system I'm working on (NOT OpenACS/dotLRN) and by fiddling with these parameters I was able to persuade the planner to use indexed scans where it seemed to me it should.

I set random_page_cost to 1.0 ...

Collapse
16: Re: Compatibility Matrix (response to 15)
Posted by Don Baccus on
This was PG 9.1 BTW.

Let me also add that I tried recasting one of the offending queries several ways (in (SELECT ... IN (SELECT..., joins, join on subquery in FROM clause, etc) and got the same plan each time (except the nested IN SELECT forms lead to semi joins rather than full joins). That's good in that the planner's able to build a canonical form from each of my twisted efforts to fool it that's equivalent to the deeper meaning of what I was trying to do. It's bad in that each plan had a sequential scan of (gulp) 69 million rows!!!!