Hi everybody,
I've been lately working on bug-tracker to make it more accessible to users, as we are using it as an error report tool. However, in the way things are now, it's completely impossible for them to just access the index page, because it's taking incredible 50 seconds (yes, fifty seconds) to load.
In order to improve it, I've just recently upgraded to the newest version on HEAD. I've found some small bugs, such as upgrade scripts on wrong folders, and some other issues, and I've also realized someone has been doing some work on it. There were three very slow queries, wich seem to be reduced to only one right now. The query I'm talking about must look like this:
select q.*,
km.keyword_id,
assign_info.*
from (
select b.bug_id,
b.project_id,
b.bug_number,
b.summary,
lower(b.summary) as lower_summary,
b.comment_content,
b.comment_format,
b.component_id,
to_char(b.creation_date,'YYYY-MM-DD HH24:MI:SS') as creation_date,
to_char(b.creation_date, 'YYYY-MM-DD HH24:MI:SS') as creation_date_pretty,
b.creation_user as submitter_user_id,
submitter.first_names as submitter_first_names,
submitter.last_name as submitter_last_name,
submitter.email as submitter_email,
lower(submitter.first_names) as lower_submitter_first_names,
lower(submitter.last_name) as lower_submitter_last_name,
lower(submitter.email) as lower_submitter_email,
st.pretty_name as pretty_state,
st.short_name as state_short_name,
st.state_id,
st.hide_fields,
b.resolution,
b.found_in_version,
b.fix_for_version,
b.fixed_in_version,
cas.case_id
from bt_bugs b,
acs_users_all submitter,
workflow_cases cas,
workflow_case_fsm cfsm,
workflow_fsm_states st
where submitter.user_id = b.creation_user
and cas.workflow_id = '1500160'
and cas.object_id = b.bug_id
and cfsm.case_id = cas.case_id
and cfsm.parent_enabled_action_id is null
and st.state_id = cfsm.current_state
and cfsm.current_state = '4'
) q
left outer join
cr_item_keyword_map km
on (bug_id = km.item_id)
left outer join
(select cru.user_id as assigned_user_id,
aa.action_id,
aa.case_id,
wa.pretty_name as action_pretty_name,
p.first_names as assignee_first_names,
p.last_name as assignee_last_name
from workflow_case_assigned_actions aa,
workflow_case_role_user_map cru,
workflow_actions wa,
persons p
where aa.case_id = cru.case_id
and aa.role_id = cru.role_id
and cru.user_id = p.person_id
and wa.action_id = aa.action_id
) assign_info
on (q.case_id = assign_info.case_id)
order by bug_number desc
I've put it on explain analyze and this is what I've found:
"Nested Loop Left Join (cost=6549.55..113191.28 rows=221 width=3204) (actual time=900.479..41514.489 rows=35 loops=1)"
" Join Filter: (cas.case_id = assign_info.case_id)"
" -> Nested Loop (cost=4.95..159.62 rows=1 width=2338) (actual time=0.408..38.840 rows=35 loops=1)"
" -> Nested Loop (cost=4.95..158.53 rows=1 width=1186) (actual time=0.403..38.408 rows=35 loops=1)"
" -> Nested Loop (cost=4.95..152.23 rows=1 width=1168) (actual time=0.392..37.839 rows=35 loops=1)"
" -> Nested Loop (cost=4.95..145.99 rows=1 width=1140) (actual time=0.380..37.314 rows=35 loops=1)"
" -> Nested Loop (cost=4.95..139.34 rows=1 width=1136) (actual time=0.365..36.695 rows=35 loops=1)"
" Join Filter: (cfsm.case_id = cas.case_id)"
" -> Nested Loop Left Join (cost=4.95..134.38 rows=1 width=1132) (actual time=0.310..21.372 rows=236 loops=1)"
" -> Nested Loop (cost=4.95..134.05 rows=1 width=1128) (actual time=0.296..20.012 rows=236 loops=1)"
" Join Filter: (cas.object_id = b.bug_id)"
" -> Index Scan Backward using bt_bugs_bug_number_idx on bt_bugs b (cost=0.00..123.79 rows=236 width=1124) (actual time=0.021..0.369 rows=236 loops=1)"
" -> Materialize (cost=4.95..4.96 rows=1 width=8) (actual time=0.000..0.032 rows=236 loops=236)"
" -> Seq Scan on workflow_cases cas (cost=0.00..4.95 rows=1 width=8) (actual time=0.012..0.097 rows=236 loops=1)"
" Filter: (workflow_id = 1500160)"
" -> Index Scan using cr_item_keyword_map_pk on cr_item_keyword_map km (cost=0.00..0.32 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=236)"
" Index Cond: (b.bug_id = km.item_id)"
" -> Seq Scan on workflow_case_fsm cfsm (cost=0.00..4.95 rows=1 width=8) (actual time=0.007..0.056 rows=35 loops=236)"
" Filter: ((parent_enabled_action_id IS NULL) AND (current_state = 4))"
" -> Index Scan using users_pk on users u (cost=0.00..6.63 rows=1 width=4) (actual time=0.012..0.014 rows=1 loops=35)"
" Index Cond: (u.user_id = b.creation_user)"
" -> Index Scan using persons_pk on persons pe (cost=0.00..6.22 rows=1 width=28) (actual time=0.010..0.012 rows=1 loops=35)"
" Index Cond: (pe.person_id = u.user_id)"
" -> Index Scan using parties_pk on parties pa (cost=0.00..6.29 rows=1 width=30) (actual time=0.010..0.013 rows=1 loops=35)"
" Index Cond: (pa.party_id = pe.person_id)"
" -> Seq Scan on workflow_fsm_states st (cost=0.00..1.07 rows=1 width=1156) (actual time=0.003..0.008 rows=1 loops=35)"
" Filter: (4 = state_id)"
" -> Hash Join (cost=6544.60..111857.00 rows=52060 width=458) (actual time=400.595..1184.905 rows=119 loops=35)"
" Hash Cond: (COALESCE(dep.deputy_user_id, u.user_id) = p.person_id)"
" -> Hash Join (cost=4221.18..108297.15 rows=52060 width=434) (actual time=398.557..1182.511 rows=119 loops=35)"
" Hash Cond: ((wcrpm.case_id = wcea.case_id) AND (wcrpm.role_id = wa.assigned_role))"
" -> Hash Join (cost=4183.83..93405.47 rows=716687 width=16) (actual time=398.522..1182.158 rows=472 loops=35)"
" Hash Cond: (pamm.member_id = u.user_id)"
" -> Merge Join (cost=29.68..72230.01 rows=716687 width=12) (actual time=395.941..1178.950 rows=472 loops=35)"
" Merge Cond: (pamm.party_id = wcrpm.party_id)"
" -> Index Scan using party_approved_member_map_pk on party_approved_member_map pamm (cost=0.00..59210.21 rows=1066296 width=8) (actual time=0.014..933.179 rows=1059493 loops=35)"
" -> Sort (cost=29.68..30.86 rows=472 width=12) (actual time=0.013..0.128 rows=472 loops=35)"
" Sort Key: wcrpm.party_id"
" -> Seq Scan on workflow_case_role_party_map wcrpm (cost=0.00..8.72 rows=472 width=12) (actual time=0.009..0.151 rows=472 loops=1)"
" -> Hash (cost=3222.25..3222.25 rows=74552 width=8) (actual time=90.017..90.017 rows=74552 loops=1)"
" -> Hash Left Join (cost=13.00..3222.25 rows=74552 width=8) (actual time=0.028..56.866 rows=74552 loops=1)"
" Hash Cond: (u.user_id = dep.user_id)"
" -> Seq Scan on users u (cost=0.00..2929.52 rows=74552 width=4) (actual time=0.011..23.485 rows=74552 loops=1)"
" -> Hash (cost=12.80..12.80 rows=16 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
" -> Seq Scan on workflow_deputies dep (cost=0.00..12.80 rows=16 width=8) (actual time=0.001..0.001 rows=0 loops=1)"
" Filter: ((now() >= start_date) AND (now() <= end_date))"
" -> Hash (cost=33.74..33.74 rows=240 width=434) (actual time=0.994..0.994 rows=240 loops=1)"
" -> Hash Join (cost=10.13..33.74 rows=240 width=434) (actual time=0.300..0.821 rows=240 loops=1)"
" Hash Cond: (wcea.action_id = wa.action_id)"
" -> Hash Join (cost=7.31..27.62 rows=240 width=12) (actual time=0.212..0.592 rows=240 loops=1)"
" Hash Cond: (wcea.case_id = c.case_id)"
" -> Seq Scan on workflow_case_enabled_actions wcea (cost=0.00..17.01 rows=240 width=8) (actual time=0.013..0.263 rows=240 loops=1)"
" Filter: ((NOT completed_p) AND assigned_p)"
" -> Hash (cost=4.36..4.36 rows=236 width=4) (actual time=0.184..0.184 rows=236 loops=1)"
" -> Seq Scan on workflow_cases c (cost=0.00..4.36 rows=236 width=4) (actual time=0.009..0.068 rows=236 loops=1)"
" -> Hash (cost=2.65..2.65 rows=14 width=430) (actual time=0.075..0.075 rows=14 loops=1)"
" -> Hash Join (cost=1.32..2.65 rows=14 width=430) (actual time=0.044..0.057 rows=14 loops=1)"
" Hash Cond: (wa.action_id = wa.action_id)"
" -> Seq Scan on workflow_actions wa (cost=0.00..1.14 rows=14 width=8) (actual time=0.006..0.008 rows=14 loops=1)"
" -> Hash (cost=1.14..1.14 rows=14 width=422) (actual time=0.020..0.020 rows=14 loops=1)"
" -> Seq Scan on workflow_actions wa (cost=0.00..1.14 rows=14 width=422) (actual time=0.004..0.011 rows=14 loops=1)"
" -> Hash (cost=1391.52..1391.52 rows=74552 width=28) (actual time=71.027..71.027 rows=74552 loops=1)"
" -> Seq Scan on persons p (cost=0.00..1391.52 rows=74552 width=28) (actual time=0.009..24.650 rows=74552 loops=1)"
"Total runtime: 41514.945 ms"
It seems like this query has two big troubles:
1 - The Index Scan on party_approved_member_map is incredibly slow.
2 - The Hash Join with the users table is also taking a long time
Most of the problems come from the huge amount of users we have (around 65.000), but I guess this performance can be improved. I've tried some configurations tuning on postgresql.conf, but I wasn't very successfull. So, can anybody give me a help on that? I believe there are other systems with the same amount of users using bug-tracker, and I don't see how it can be fast enough in order to be useable.
Any help is appreciated.