Forum OpenACS Development: Re: Fixing a very slow query
it's not the sort, it's the big outer nested loop that's killing you... the first result is returned 19.707ms after it starts, and the last is almost 26 seconds later... the outer sort starts ~1ms after the nested loop completes and takes less than a millisecond...
the innermost seq. scan on cr_folders is the real issue - it's cheap but is executed 5301 times in that example... it's also odd that cr_folders is being seq. scanned when they're all doing is a direct match against folder_id - is there any chance that cr_folders.folder_id is not indexed, or the db hasn't been vacuumed in a while?
a rough guess is (depending on your PG version and it's planner) you might see a worthwhile improvement by shifting the "and not exists (select 1 from ...)" condition above "and exists (select 1 from cr_folders f where f.folder_id = i.item_id)" and see if the planner is smart enough to take the hint about checking item_id and tree_sortkey before seq. scanning cr_folders to see if the item in question happens to be a folder...
I remember asking questions about optimising the permissions queries in OpenACS after seeing some very strange query times. It was Don who explained that 'vacuum analyse' in postgres collects stats and allows the planner to make much better decisions.
The combination of indices on the integer keys and multiple runs of vacuum analyse should result in a much improved execution plan.
Regarding "exists" vs. "in", you're probably remembering advice from long ago, as the PG people figured out how to implement "in" as a special kind of join-like operation some years ago, thus making it as efficient in PG as it had been in Oracle.
In the past "in" was always extremely slow.
"exists" should still be faster, though.