Forum OpenACS Development: Postgres and index usage

Collapse
Posted by Malte Sussdorff on
I was (til now) under the assumption, that postgres would use unique indexes to prevent full table scans. It seems not to do so.

My example: parties

parties has an index parties_email_un. If you check the execution path with explain though, it will show you that it is not using the unique index. Only after I created a special index "create index parties_email_idx on parties (email)" did it use this index.

If I'm not utterly mistaken, the email get's looked up quite often in OpenACS. So maybe I missed some parameter to make sure postgres is using an index on the email column. Or maybe it is using the parties_email_un index, but explain does not say so.

OTOH, if I did not miss anything, I'd strongly suggest to get some indexes in place with postgres really soon now (e.g. within the 5.0 release), to make sure even the postgres version can scale (to a certain degree).

Collapse
Posted by Don Baccus on
If anything it should be more likely to use a unique index than a non-unique index because that tells it that only one value will be returned (rather than some arbitrarily large subset of the table).

Can you post some actual code and plan EXPLAINations for us to look at?

If we can't figure out why it didn't use the original index in your insert we can post to the postgres mailing list.

It does use it for selects on openacs.org:

openacs.org=# explain select * from parties where email = 'mailto:dhogaza@pacifier.com';;
NOTICE:  QUERY PLAN:

Index Scan using parties_email_un on parties  (cost=0.00..5.98 rows=1 width=56)

EXPLAIN

Collapse
Posted by Malte Sussdorff on
Well, I dropped the index that i created and issued the same explain command again, and suddenly it didn't say "seq scan on parties" anymore. Really strange. Well, apparently the fault was on my side, so sorry for posting this. Though I still don't get it....
Collapse
Posted by Claudio Pasolini on
I have two (almost) identical servers running the same query: the first uses the index while the second scans sequentially one of the (big) joined tables.

I dropped and recreated the second db from a pg_dump of the first getting the same results.

Fortunately the first is the production server and so, after some unsuccesfull investigation, I decided to forget the problem.

Collapse
Posted by Don Baccus on
Malte ... I'm sure it has to do with when you ran ANALYZE ... when the contents of parties gets large enough it will switch to using the index.
Collapse
Posted by Tom Ayles on
I believe that sometimes Postgres will not bother with an index scan on a table with only a few rows in it, as in that case a sequential scan is faster. Could it be that differing amounts of data in the table is causing the optimizer to use different plans?