Forum OpenACS Q&A: Re: Openacs.org having problems. Refrain from using it until Sept 25th

Good news: Bug-tracker is working, the upgrade was successful after I manually executed the bug-tracker apm-upgrade callbacks and made some fixes to the upgrade scripts.

Bad news: Unusable due to performance issues. I will post the explain plans soon, but in this state it is safe to say we can't upgrade.

Collapse
Posted by Malte Sussdorff on
This is only the *first* query to run for displaying *one* single bug.

upgrade-oo-5=> explain analyze select q.*,
upgrade-oo-5->        km.keyword_id,
upgrade-oo-5->        assign_info.*
upgrade-oo-5-> from (
upgrade-oo-5(>  select b.bug_id,
upgrade-oo-5(>          b.bug_number,
upgrade-oo-5(>          b.summary,
upgrade-oo-5(>          lower(b.summary) as lower_summary,
upgrade-oo-5(>          b.comment_content,
upgrade-oo-5(>          b.comment_format,
upgrade-oo-5(>          b.component_id,
upgrade-oo-5(>          b.creation_date,
upgrade-oo-5(>          to_char(b.creation_date, 'fmMM/DDfm/YYYY') as creation_date_pretty,
upgrade-oo-5(>          b.creation_user as submitter_user_id,
upgrade-oo-5(>          submitter.first_names as submitter_first_names,
upgrade-oo-5(>          submitter.last_name as submitter_last_name,
upgrade-oo-5(>          submitter.email as submitter_email,
upgrade-oo-5(>          lower(submitter.first_names) as lower_submitter_first_names,
upgrade-oo-5(>          lower(submitter.last_name) as lower_submitter_last_name,
upgrade-oo-5(>          lower(submitter.email) as lower_submitter_email,
upgrade-oo-5(>          st.pretty_name as pretty_state,
upgrade-oo-5(>          st.short_name as state_short_name,
upgrade-oo-5(>          st.state_id,
upgrade-oo-5(>          st.hide_fields,
upgrade-oo-5(>          b.resolution,
upgrade-oo-5(>          b.found_in_version,
upgrade-oo-5(>          b.fix_for_version,
upgrade-oo-5(>          b.fixed_in_version,
upgrade-oo-5(>          cas.case_id
upgrade-oo-5(>
upgrade-oo-5(>    from bt_bugs b,
upgrade-oo-5(>          acs_users_all submitter,
upgrade-oo-5(>          acs_users_all assignee,
upgrade-oo-5(>          workflow_cases cas,
upgrade-oo-5(>          workflow_case_fsm cfsm,
upgrade-oo-5(>          workflow_fsm_states st
upgrade-oo-5(>    where submitter.user_id = b.creation_user
upgrade-oo-5(>      and cas.workflow_id = '206311'
upgrade-oo-5(>      and cas.object_id = b.bug_id
upgrade-oo-5(>      and cfsm.case_id = cas.case_id
upgrade-oo-5(>      and cfsm.parent_enabled_action_id is null
upgrade-oo-5(>      and st.state_id = cfsm.current_state
upgrade-oo-5(>
upgrade-oo-5(>    and cfsm.current_state = '10'
upgrade-oo-5(> ) q
upgrade-oo-5-> left outer join
upgrade-oo-5->  cr_item_keyword_map km
upgrade-oo-5-> on (bug_id = km.item_id)
upgrade-oo-5-> left outer join
upgrade-oo-5->  (select cru.user_id as assigned_user_id,
upgrade-oo-5(>          aa.action_id,
upgrade-oo-5(>          aa.case_id,
upgrade-oo-5(>          wa.pretty_name as action_pretty_name,
upgrade-oo-5(>          assignee.first_names as assignee_first_names,
upgrade-oo-5(>          assignee.last_name as assignee_last_name
upgrade-oo-5(>      from workflow_case_assigned_actions aa,
upgrade-oo-5(>          workflow_case_role_user_map cru,
upgrade-oo-5(>          workflow_actions wa,
upgrade-oo-5(>          acs_users_all assignee
upgrade-oo-5(>    where aa.case_id = cru.case_id
upgrade-oo-5(>      and aa.role_id = cru.role_id
upgrade-oo-5(>      and cru.user_id = assignee.user_id
upgrade-oo-5(>      and wa.action_id = aa.action_id
upgrade-oo-5(>  ) assign_info
upgrade-oo-5-> on (q.case_id = assign_info.case_id)
upgrade-oo-5->    order by bug_number desc;
                                                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=61090.28..61211.08 rows=48319 width=932) (actual time=13986.656..15343.380 rows=288864 loops=1)
  Sort Key: b.bug_number
  ->  Hash Left Join  (cost=2789.18..4940.99 rows=48319 width=932) (actual time=639.752..11525.309 rows=288864 loops=1)
        Hash Cond: ("outer".case_id = "inner".case_id)
        ->  Hash Left Join  (cost=1135.64..2441.64 rows=48319 width=666) (actual time=172.199..5360.479 rows=288864 loops=1)
              Hash Cond: ("outer".bug_id = "inner".item_id)
              ->  Nested Loop  (cost=1020.33..1400.22 rows=16106 width=662) (actual time=121.938..1925.355 rows=96288 loops=1)
                    ->  Nested Loop  (cost=0.00..57.77 rows=2 width=662) (actual time=0.262..3.768 rows=12 loops=1)
                          ->  Nested Loop  (cost=0.00..53.37 rows=1 width=670) (actual time=0.223..3.129 rows=12 loops=1)
                                ->  Nested Loop  (cost=0.00..52.21 rows=1 width=619) (actual time=0.178..2.463 rows=12 loops=1)
                                      ->  Nested Loop  (cost=0.00..48.57 rows=1 width=596) (actual time=0.141..1.896 rows=12 loops=1)
                                            ->  Nested Loop  (cost=0.00..41.49 rows=2 width=569) (actual time=0.106..1.341 rows=12 loops=1)
                                                  ->  Nested Loop  (cost=0.00..35.61 rows=1 width=12) (actual time=0.070..0.777 rows=12 loops=1)
                                                        ->  Index Scan using wf_case_fsm_state_idx on workflow_case_fsm cfsm  (cost=0.00..32.58 rows=1 width=8) (actual time=0.026..0.191 rows=12 loops=1)
                                                              Index Cond: (current_state = 10)
                                                              Filter: (parent_enabled_action_id IS NULL)
                                                        ->  Index Scan using workflow_cases_pk on workflow_cases cas  (cost=0.00..3.02 rows=1 width=8) (actual time=0.021..0.027 rows=1 loops=12)
                                                              Index Cond: ("outer".case_id = cas.case_id)
                                                              Filter: (workflow_id = 206311)
                                                  ->  Index Scan using bt_bug_pk on bt_bugs b  (cost=0.00..5.86 rows=1 width=561) (actual time=0.013..0.020 rows=1 loops=12)
                                                        Index Cond: ("outer".object_id = b.bug_id)
                                            ->  Index Scan using parties_pk on parties pa  (cost=0.00..3.53 rows=1 width=27) (actual time=0.014..0.020 rows=1 loops=12)
                                                  Index Cond: ("outer".creation_user = pa.party_id)
                                      ->  Index Scan using persons_pk on persons pe  (cost=0.00..3.63 rows=1 width=23) (actual time=0.013..0.019 rows=1 loops=12)
                                            Index Cond: ("outer".creation_user = pe.person_id)
                                ->  Seq Scan on workflow_fsm_states st  (cost=0.00..1.15 rows=1 width=55) (actual time=0.011..0.021 rows=1 loops=12)
                                      Filter: (10 = state_id)
                          ->  Index Scan using users_pk on users u  (cost=0.00..4.39 rows=1 width=4) (actual time=0.013..0.024 rows=1 loops=12)
                                Index Cond: (u.user_id = "outer".party_id)
                    ->  Materialize  (cost=1020.33..1100.86 rows=8053 width=0) (actual time=10.141..58.123 rows=8024 loops=12)
                          ->  Hash Join  (cost=329.57..1020.33 rows=8053 width=0) (actual time=121.638..322.682 rows=8024 loops=1)
                                Hash Cond: ("outer".person_id = "inner".party_id)
                                ->  Hash Join  (cost=158.64..648.08 rows=8052 width=8) (actual time=60.693..180.701 rows=8024 loops=1)
                                      Hash Cond: ("outer".user_id = "inner".person_id)
                                      ->  Seq Scan on users u  (cost=0.00..280.57 rows=8557 width=4) (actual time=0.006..34.454 rows=8024 loops=1)
                                      ->  Hash  (cost=138.51..138.51 rows=8051 width=4) (actual time=60.636..60.636 rows=0 loops=1)
                                            ->  Seq Scan on persons pe  (cost=0.00..138.51 rows=8051 width=4) (actual time=0.007..31.320 rows=8024 loops=1)
                                ->  Hash  (cost=150.75..150.75 rows=8075 width=4) (actual time=60.900..60.900 rows=0 loops=1)
                                      ->  Seq Scan on parties pa  (cost=0.00..150.75 rows=8075 width=4) (actual time=0.007..31.528 rows=8045 loops=1)
              ->  Hash  (cost=98.85..98.85 rows=6585 width=8) (actual time=50.218..50.218 rows=0 loops=1)
                    ->  Seq Scan on cr_item_keyword_map km  (cost=0.00..98.85 rows=6585 width=8) (actual time=0.013..26.073 rows=6585 loops=1)
        ->  Hash  (cost=1653.53..1653.53 rows=1 width=266) (actual time=467.482..467.482 rows=0 loops=1)
              ->  Subquery Scan assign_info  (cost=946.14..1653.53 rows=1 width=266) (actual time=467.474..467.474 rows=0 loops=1)
                    ->  Hash Join  (cost=946.14..1653.52 rows=1 width=44) (actual time=467.466..467.466 rows=0 loops=1)
                          Hash Cond: ("outer".user_id = COALESCE("inner".deputy_user_id, "inner".user_id))
                          ->  Hash Join  (cost=502.90..1170.00 rows=8053 width=31) (actual time=122.691..321.421 rows=8024 loops=1)
                                Hash Cond: ("outer".person_id = "inner".party_id)
                                ->  Hash Join  (cost=331.96..797.76 rows=8052 width=27) (actual time=61.971..172.555 rows=8024 loops=1)
                                      Hash Cond: ("outer".person_id = "inner".user_id)
                                      ->  Seq Scan on persons pe  (cost=0.00..138.51 rows=8051 width=23) (actual time=0.007..29.966 rows=8024 loops=1)
                                      ->  Hash  (cost=280.57..280.57 rows=8557 width=4) (actual time=61.847..61.847 rows=0 loops=1)
                                            ->  Seq Scan on users u  (cost=0.00..280.57 rows=8557 width=4) (actual time=0.010..32.245 rows=8024 loops=1)
                                ->  Hash  (cost=150.75..150.75 rows=8075 width=4) (actual time=60.683..60.683 rows=0 loops=1)
                                      ->  Seq Scan on parties pa  (cost=0.00..150.75 rows=8075 width=4) (actual time=0.007..31.467 rows=8045 loops=1)
                          ->  Hash  (cost=443.24..443.24 rows=1 width=25) (actual time=117.214..117.214 rows=0 loops=1)
                                ->  Hash Join  (cost=441.81..443.24 rows=1 width=25) (actual time=117.208..117.208 rows=0 loops=1)
                                      Hash Cond: ("outer".action_id = "inner".action_id)
                                      ->  Seq Scan on workflow_actions wa  (cost=0.00..1.28 rows=28 width=13) (actual time=0.006..0.126 rows=28 loops=1)
                                      ->  Hash  (cost=441.81..441.81 rows=2 width=20) (actual time=116.975..116.975 rows=0 loops=1)
                                            ->  Hash Join  (cost=75.63..441.81 rows=2 width=20) (actual time=116.968..116.968 rows=0 loops=1)
                                                  Hash Cond: ("outer".user_id = "inner".member_id)
                                                  ->  Hash Left Join  (cost=0.00..323.38 rows=8557 width=8) (actual time=0.036..88.308 rows=8024 loops=1)
                                                        Hash Cond: ("outer".user_id = "inner".user_id)
                                                        ->  Seq Scan on users u  (cost=0.00..280.57 rows=8557 width=4) (actual time=0.006..30.137 rows=8024 loops=1)
                                                        ->  Hash  (cost=0.00..0.00 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1)
                                                              ->  Seq Scan on workflow_deputies dep  (cost=0.00..0.00 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=1)
                                                                    Filter: ((('now'::text)::timestamp(6) with time zone >= start_date) AND (('now'::text)::timestamp(6) with time zone <= end_date))
                                                  ->  Hash  (cost=75.62..75.62 rows=1 width=16) (actual time=0.301..0.301 rows=0 loops=1)
                                                        ->  Nested Loop  (cost=1.35..75.62 rows=1 width=16) (actual time=0.294..0.294 rows=0 loops=1)
                                                              ->  Nested Loop  (cost=1.35..10.07 rows=1 width=16) (actual time=0.288..0.288 rows=0 loops=1)
                                                                    ->  Nested Loop  (cost=1.35..4.46 rows=1 width=20) (actual time=0.280..0.280 rows=0 loops=1)
                                                                          ->  Hash Join  (cost=1.35..1.43 rows=1 width=16) (actual time=0.274..0.274 rows=0 loops=1)
                                                                                Hash Cond: ("outer".action_id = "inner".action_id)
                                                                                ->  Seq Scan on workflow_case_enabled_actions wcea  (cost=0.00..0.00 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)
                                                                                      Filter: ((completed_p = false) AND (assigned_p = true))
                                                                                ->  Hash  (cost=1.28..1.28 rows=28 width=8) (actual time=0.252..0.252 rows=0 loops=1)
                                                                                      ->  Seq Scan on workflow_actions wa  (cost=0.00..1.28 rows=28 width=8) (actual time=0.013..0.130 rows=28 loops=1)
                                                                          ->  Index Scan using workflow_cases_pk on workflow_cases c  (cost=0.00..3.01 rows=1 width=4) (never executed)
                                                                                Index Cond: (c.case_id = "outer".case_id)
                                                                    ->  Index Scan using wf_case_role_pty_map_pk on workflow_case_role_party_map wcrpm  (cost=0.00..5.60 rows=1 width=12) (never executed)
                                                                          Index Cond: ((wcrpm.case_id = "outer".case_id) AND ("outer".assigned_role = wcrpm.role_id))
                                                              ->  Index Scan using party_approved_member_map_pk on party_approved_member_map pamm  (cost=0.00..65.34 rows=17 width=8) (never executed)
                                                                    Index Cond: (pamm.party_id = "outer".party_id)
Total runtime: 16444.930 ms
(84 rows)

Collapse
Posted by Jade Rubick on
A couple of things I'd start with:

Replace acs_users_all with a join on persons and parties. That removes the join on the users table, which should give us a but measurable improvement, especially since there are a couple sequential scans on persons, parties, and users.

I'd like to look at the query to figure out what's going on with the subselects in the from statement. I've had to do that a couple times in the past, but maybe in this case it isn't necessary.

Collapse
Posted by Malte Sussdorff on
The site currently runs at http://sussdorff.dyndns.org:8000/. As I need to do it anyway at one point in time, I will copy the whole code along with the database snapshot over to samoyed, so people who have access to that machine can go ahead and play around.
I installed AOLserver and Postgres on samoyed. Furthermore there is a file at /var/lib/aolserver called samoyed.tgz, which contains the code and the database file(s) that make up my current installation of openacs.org. As it takes some time to upload, expect it to be available at 8pm GMT. If you are keen to work on it immediately, just unpack it and load the file upgrade-3.dmp into postgresql (user upgrade-oo-5 and database upgrade-oo-5 already created). Good luck with tuning and thanks for taking this on Jade.
URL changed. check out http://samoyed.furfly.net :)
Collapse
Posted by Andrew Piskorski on
The SourceForge bug tracker sucks fairly badly; I don't particularly recommend that OpenACS use it.

It would also be really annoying (and rather embarassing to the project) to see all the old bug data in the current bug tracker simple abandoned. Nearly as bad would be to keep the "old" openacs.org site running solely in order to access the old bug tracker. OpenACS already did both those things once when converting openacs.org from OpenACS 3.x to 4.x - just plain ugly.

I would like to hear Lars' opinion of Bug Tracker, as well as anyone else who is familiar with the Bug Tracker internals. Is Bug Tracker basically fine and just needs some query tuning? Does it suffer from serious internal design problems? Is it too complicated? Or what?

Since Malte already figured out how to upgrade the openacs.org Bug Tracker, tuning the queries seems fairly likely to be both the least effort and best result path. Major rewrites of Bug Tracker are probably premature until someone at least tries seriously to figure out why those queries suddenly suck so badly, despite being ok on the old (current) openacs.org site.