Forum OpenACS Q&A: Response to What expectations/settings should I have for Postgres?

Is there a way you can get distinct rows in the first query without actually using distinct?

Yes, Don mentioned a way above, the query I used was a nasty mess, I cleaned it up and got it to work, but I wanted to learn what was going on and learning I am!

does the count(*) improve after you run it a second time, i.e. after presumably the data is already in memory?

No it does not seem to improve. Overall on this system when I run queries they do not seem to be any faster the next time around. It seemed to improve a bit with a higher value for the -B setting for PG.

Have you done a "vacuum analyze"?

Yep did that, do it every night too.

Do you have indices created on functions?

No indices on this table yet, the table does not have a primary key yet. This query is step 1 in fixing that problem. I am not sure how one would place an index on a function.

Why can't you just do this? select count(*), sid, membid, date, time from logs group by sid, membid, date, time having count(*) > 1

I did do that and it ran, I was just very curious as to why it failed. It took a long time, about 3 minutes to run the query above but that is fine.

The whole reason I had to run this query is because this table has no primary key and I needed to figure out how to select every distinct row. Winds up there is no way to distiguish a row, there are massive amounts of duplicate values, so time for me to fix that problem. This is the joy of adopting an ill designed table, now I get to clean it up.

I never realized how poorly configured this box was for PG. I have done what Don suggested and moved the shared memory buffer up to 3000 * 8192, 24mb. Evidently the kernel for this box is only set to handle 24mb or so as a max. I really want to learn more about this, it seems very important. Now that I am getting more and more into PG the more I want to understand it. Where can I get the in depth details on the -S and -B swicthes for PG and the linux kernel settings? I need some resources on what these different memory areas are for etc.

Thanks for the advice, it was great.

-Jeff