Forum OpenACS Q&A: Response to Faster Selects with study of EXPLAIN

Collapse
Posted by Don Baccus on
First of all, the cost estimates should not be used to attempt to decide which of the two approaches is faster.

If you look at the estimate on the cost of the select on the users_contact table in the JOIN query you'll see it is estimating that roughly ten times as many rows will be returned than in the non-join case.  This is clearly bogus, as only one row will be returned for the entire query.  That 10x factor right there explains why EXPLAIN makes you think the query will run at least 9x slower than doing the two separately.

These cost estimates are only used by the optimizer to decide what kind of scans and joins to use, they're not intended for anything else.  The estimates are known to be relatively obtuse as they are based on very minimal statistics kept for the table.  PG 7.2 keeps additional statistics which should help the optimizer make better decisions, but the underlying notion that the cost estimate is just an estimate has not changed.

In general a join with a good plan will run faster than two queries.

Stacking the deck further against breaking your queries up, two queries mean two round-trips from the driver to the DB, two parses, two optimization passes, two executions with two resultsets being passed back to the drive, stuffed into Tcl vars, and then read by your code.

As far as your view question goes, PG executes views dynamically so normally the execution time will be roughly the time taken for the query written out by hand.