Forum OpenACS Q&A: Weird PostgreSQL SELECT processing

Collapse
Posted by Vlad Seryakov on
I have a function that restricts records by user id. i put log notice
inside the function to see how it is used and i noticed that
PostgreSQL calles it too many times.
Below is an example, it should be called only once with id 177 but..

select p.placement_id
from placements p,
    (select placement_id,count(*) from placement_log log
      where placement_id=177 group by placement_id) l
where .placement_id=p.placement_id and
warp_user_access(0,'placement',p.placement_id);

NOTICE:  placement 149
NOTICE:  placement 35
NOTICE:  placement 36
NOTICE:  placement 37
NOTICE:  placement 38
NOTICE:  placement 137
NOTICE:  placement 39
NOTICE:  placement 139
NOTICE:  placement 143

Collapse
Posted by Jonathan Ellis on
I guess warp_user_access is the function you're talking about?

the "where placement_id = 177" is in your subquery. it doesn't look like you need a subquery here; just do something like

select p.placement_id, count(pl.*) as whatever
from placements p, placement_log pl
where p.placement_id = pl.placement_id
  and p.placement_id = 177
  and warp_user_access(...)
group by p.placement_id
Still, PG could evaluate the warp_user_access before the placement_id = 177 if the planner thinks it can throw away more rows that way.

(I'm assuming there's more to this query than what you're showing because otherwise you don't need to join to the placements table at all to get your count...)

Collapse
Posted by Don Baccus on
select p.placement_id
from placements p, (select placement_id,count(*)
                    from placement_log log
                    where placement_id=177 group by placement_id) l where placement_id=p.placement_id and
 warp_user_access(0,'placement',p.placement_id); 
It looks like it is calling the function before doing the join, which in general isn't a terribly bad idea though it can be in certain cases. Actually it's probably doing the qualification while doing the join which results in the same thing (calling the function for every row in placements) but I'd have to dig around inside PG to be sure.

If it makes you feel better Oracle frequently does the same thing.

Collapse
Posted by Vlad Seryakov on
Yes, this is a smal portion of the whole query i use, i retrieve more columns and i do not want to group by all of them.
i did notice this only when we got more than 100k records and the query became very slow. The trick here that subquery is used to do filtering, it is not fixed SQL statement it may contain more conditions.
It seems i need to figure out different method of doing this.
thanks
Collapse
Posted by Jonathan Ellis on
I agree that a large amount of GROUP BY columns is ugly, but I don't think it slows things down. May want to test that.

However you can also wrap things in another subquery to force the planner to do what you want:

select * from (
    select p.placement_id
    from placements p, (select placement_id,count(*)
                        from placement_log log
                        where placement_id=177 group by placement_id) l 
    where l.placement_id=p.placement_id
) as q1
where warp_user_access(0,'placement',p.placement_id); 
Collapse
Posted by Jonathan Ellis on
didn't proofread very well... that should of course be

where warp_user_access(0,'placement',q1.placement_id);

Collapse
Posted by Vlad Seryakov on
i tried this, it is the same, i did vacuum it helped for other queries but this one with GROUP BY still runs the functions for all records.
i sent question into pgsql mailing list too.
Collapse
Posted by David Walker on
  
select p.placement_id from placements p, (select  
placement_id,count(*),warp_user_access(0,'placement',p.placement_id)  
as warped_access from placement_log log where  
placement_id=177 group by placement_id) l where  
l.placement_id=p.placement_id and l.warped_access ;  

Try moving the function call into the select portion of a sub-query that is limited on placement_id. That way the function should only be executed for the placement_id(s) that are returned. Then refer to that alias in your outer query to limit on that.