Forum OpenACS Development: Fixing a very slow query

Posted by Janine Ohmer on 05/23/09 12:19 AM

The following is from file-storage/www/folder-list-postgresql.xql, a file which no longer exists on HEAD. It's running very slowly in one of my sites:

select lpad('  ',12 * tree_level(i.tree_sortkey),'  ') as spaces,
(select f.label from cr_folders f where f.folder_id = i.item_id) as label,
(select f.folder_id from cr_folders f where f.folder_id = i.item_id) as new_parent
from cr_items i, cr_items l
where acs_permission__permission_p(i.item_id,'0','write')
and exists (select 1 from cr_folders f where f.folder_id = i.item_id)

and not exists (select 1
from cr_items j, cr_items k
where i.item_id = j.item_id
and j.item_id != '78948'
and j.tree_sortkey between k.tree_sortkey and tree_right(k.tree_sortkey)
and k.item_id = '78948')

and i.tree_sortkey between l.tree_sortkey and tree_right(l.tree_sortkey)
and l.item_id = file_storage__get_root_folder('522')
order by i.tree_sortkey;

I did some searches and found a few suggestions to use "in" instead of "exists", so I tried changing the middle clause to use "and 1 not in " but it only helped slightly.

My hope was that this would have been optimized already in a newer version of OpenACS, but instead it has been removed. Maybe that means it can't be made better?

Explain Analyze says

Sort (cost=30476.68..30477.01 rows=132 width=15) (actual time=25913.139..25913.169 rows=15 loops=1)
Sort Key: i.tree_sortkey
Sort Method: quicksort Memory: 18kB
-> Nested Loop (cost=0.50..30472.03 rows=132 width=15) (actual time=19.707..25912.730 rows=15 loops=1)
-> Index Scan using cr_items_pk on cr_items l (cost=0.25..8.52 rows=1 width=11) (actual time=0.342..0.348 rows=1 loops=1)
Index Cond: (item_id = file_storage__get_root_folder(522))
-> Index Scan using cr_sortkey_idx on cr_items i (cost=0.25..30041.77 rows=132 width=15) (actual time=15.755..25905.866 rows=15 loops=1)
Index Cond: ((i.tree_sortkey >= l.tree_sortkey) AND (i.tree_sortkey <= tree_right(l.tree_sortkey)))
Filter: (acs_permission__permission_p(i.item_id, 0, 'write'::character varying) AND (subplan) AND (NOT (subplan)))
SubPlan
-> Nested Loop (cost=0.00..16.81 rows=1 width=0) (actual time=0.094..0.094 rows=0 loops=15)
Join Filter: ((j.tree_sortkey >= k.tree_sortkey) AND (j.tree_sortkey <= tree_right(k.tree_sortkey)))
-> Index Scan using cr_items_pk on cr_items j (cost=0.00..8.27 rows=1 width=11) (actual time=0.032..0.036 rows=1 loops=15)
Index Cond: ($0 = item_id)
Filter: (item_id <> 78948)
-> Index Scan using cr_items_pk on cr_items k (cost=0.00..8.27 rows=1 width=11) (actual time=0.020..0.025 rows=1 loops=15)
Index Cond: (k.item_id = 78948)
-> Seq Scan on cr_folders f (cost=0.00..1.34 rows=1 width=0) (actual time=0.074..0.074 rows=0 loops=5301)
Filter: (folder_id = $0)
SubPlan
-> Seq Scan on cr_folders f (cost=0.00..1.34 rows=1 width=4) (actual time=0.022..0.033 rows=1 loops=15)
Filter: (folder_id = $0)
-> Seq Scan on cr_folders f (cost=0.00..1.34 rows=1 width=9) (actual time=0.026..0.037 rows=1 loops=15)
Filter: (folder_id = $0)
Total runtime: 25913.827 ms
(25 rows)

The sequential scans look cheap; it seems to be the sorting that is the most expensive.

Any suggestions? Thanks in advance!

2: Re: Fixing a very slow query (response to 1)

Posted by russ m on 08/05/09 09:37 AM

I've got no idea if you still care about this, but...

it's not the sort, it's the big outer nested loop that's killing you... the first result is returned 19.707ms after it starts, and the last is almost 26 seconds later... the outer sort starts ~1ms after the nested loop completes and takes less than a millisecond...

the innermost seq. scan on cr_folders is the real issue - it's cheap but is executed 5301 times in that example... it's also odd that cr_folders is being seq. scanned when they're all doing is a direct match against folder_id - is there any chance that cr_folders.folder_id is not indexed, or the db hasn't been vacuumed in a while?

a rough guess is (depending on your PG version and it's planner) you might see a worthwhile improvement by shifting the "and not exists (select 1 from ...)" condition above "and exists (select 1 from cr_folders f where f.folder_id = i.item_id)" and see if the planner is smart enough to take the hint about checking item_id and tree_sortkey before seq. scanning cr_folders to see if the item in question happens to be a folder...

3: Re: Fixing a very slow query (response to 2)

Posted by Richard Hamilton on 08/05/09 10:52 PM

I thought that the query planner was supposed to be clever enough to select the best method! I also have always been led to believe that 'exists' is superior to 'in' because the subquery need only return a boolean value rather that requiring a sequential search through the subquery output.

I remember asking questions about optimising the permissions queries in OpenACS after seeing some very strange query times. It was Don who explained that 'vacuum analyse' in postgres collects stats and allows the planner to make much better decisions.

The combination of indices on the integer keys and multiple runs of vacuum analyse should result in a much improved execution plan.

5: Re: Fixing a very slow query (response to 3)

Posted by Don Baccus on 08/07/09 06:53 PM

Statistics-based optimization will always be fooled at times, unfortunately.

Regarding "exists" vs. "in", you're probably remembering advice from long ago, as the PG people figured out how to implement "in" as a special kind of join-like operation some years ago, thus making it as efficient in PG as it had been in Oracle.

In the past "in" was always extremely slow.

"exists" should still be faster, though.

6: Re: Fixing a very slow query (response to 2)

Posted by Janine Ohmer on 08/14/09 10:55 PM

Thanks! Although I eventually punted on that particular query, your post was helpful to me in reading the explain output for another one just a few days ago. I find the documentation for explain to be rather... opaque, to say the least.

4: Re: Fixing a very slow query (response to 1)

Posted by russ m on 08/06/09 04:09 AM

richard - yes, it is, but sometimes it's not quite so smart... particularly if the table stats are incorrect (as is implied by the seq. scanning of cr_folders)...