My schema deals with "parties" that fight "battles," and I want to
see what percent of battles each party has won. The battles table
looks like this:
create table battles (
attacker integer not null references parties(id),
defender integer not null references parties(id),
winner integer not null check (winner in
(attacker, defender, 0)),
...
);
I made a view for the information I want so I don't have to reproduce
the query on more than one page:
create view won_percent_raw_v as
select p.id, coalesce(
(select 100*sum(case when winner = p.id then 1 else 0 end) / count
(*)
from battles b
where p.id in (b.attacker, b.defender))
, 0) as won_percent_raw
from parties p;
This is taking roughly 0.1 seconds per aggregate, or 5 seconds for
the 50 parties (and 60k battles). Here's the explain:
explain select * from won_percent_raw_v;
NOTICE: QUERY PLAN:
Seq Scan on parties p (cost=0.00..1.43 rows=43 width=4)
SubPlan
-> Aggregate (cost=1472.20..1472.20 rows=1 width=4)
-> Index Scan using battles_attacker_i, battles_defender_i
on battles b (cost=0.00..1460.21 rows=2397 width=4)
-> Aggregate (cost=1472.20..1472.20 rows=1 width=4)
-> Index Scan using battles_attacker_i, battles_defender_i
on battles b (cost=0.00..1460.21 rows=2397 width=4)
I'm wondering if there is a better way to do this, because it's
obviously not scaling very well. (Select * is actually a query I
need to do to compute the global stats, so I can't just narrow down
which parties I'm computing.) I did try getting rid of the subselect
and instead doing
reate view won_percent_raw_v as
select p.id,
coalesce(sum(case when winner = p.id then 1 else 0 end) / count
(*), 0)
from parties p, battles b
where p.id in (b.attacker, b.defender)
group by p.id;
but that gives the same query plan with an added sort due to the
group by.