Forum OpenACS Q&A: Re: Openacs.org having problems. Refrain from using it until Sept 25th

Robots.txt has been installed. ETP index.vuh has been installed but ETP does not work as on this site.

As for bug-tracker the problem is that the 0.9-1.2 upgrade script does not work. Which results in the bugs not being imported though everything else. Currently I'm too lazy for debugging this.

Last but not least I won't have time to work on this til middle of November, which sucks, but there is not much I can do to help it.

Suggestions: If someone wants to work on it I can give access on my computer. Just send me a mail.

What needs to be done is ETP (get everything in place, so the pages display something that resembles openacs.org). And to make the upgrade work for bug-tracker. Apparently the way described above had some problem, which is why it failed.

Collapse
Posted by Jeff Davis on
I think it boils down to the notifications sweeper. notification::sweep::sweep_notifications.select_notifications:
           select notification_id,
                   notif_subject,
                   notif_text,
                   notif_html,
                   user_id,
                   type_id,
                   delivery_method_id,
                   response_id,
                   notif_date
            from notifications inner join notification_requests using (type_id, object_id)
              inner join acs_objects on (notification_requests.request_id = acs_objects.object_id)
              left outer join notification_user_map using (notification_id, user_id)
            where sent_date is null
              and creation_date <= notif_date
              and interval_id = 2889
          order by user_id, type_id, notif_date
Has this plan.
Sort  (cost=10373.08..10373.08 rows=2900 width=832) (actual time=603170.17..603170.17 rows=0 loops=1)
  ->  Hash Join  (cost=4815.63..9591.27 rows=2900 width=832) (actual time=603170.08..603170.08 rows=0 loops=1)
        ->  Hash Join  (cost=4622.03..8021.80 rows=2900 width=816) (actual time=1838.57..43845.84 rows=175853 loops=1)
              ->  Merge Join  (cost=1145.89..2066.65 rows=8700 width=804) (actual time=90.50..5933.08 rows=184280 loops=1)
                    ->  Index Scan using notification_requests_t_o_idx on notification_requests  (cost=0.00..751.84 rows=13320 width=20) (actual time=24.68..4322.84 rows=13182 loops=1)
                    ->  Sort  (cost=1145.89..1145.89 rows=2044 width=784) (actual time=50.90..568.95 rows=184276 loops=1)
                          ->  Seq Scan on notifications  (cost=0.00..627.44 rows=2044 width=784) (actual time=0.04..24.12 rows=2044 loops=1)
              ->  Hash  (cost=2758.11..2758.11 rows=104011 width=12) (actual time=499.13..499.13 rows=0 loops=1)
                    ->  Seq Scan on acs_objects  (cost=0.00..2758.11 rows=104011 width=12) (actual time=0.06..349.11 rows=110383 loops=1)
        ->  Hash  (cost=168.08..168.08 rows=10208 width=16) (actual time=7225.23..7225.23 rows=0 loops=1)
              ->  Seq Scan on notification_user_map  (cost=0.00..168.08 rows=10208 width=16) (actual time=24.26..6596.30 rows=290920 loops=1)
Total runtime: 603368.41 msec
Notice the total runtime there. Yikes!

I stopped openacs.org and ran this by itself with no notifications in the table and the backend required 240mb and took about 8 minutes to return (with the server at load average 0.1 when I started) so it's not a problem with locking or anything going on in the db. The query just really sucks.

I suspect the problem is the notifications table has just grown too big since it's not being purged and I vaguely remember Don maybe having cleaned it out before. Is that right, Don?

Good news: Bug-tracker is working, the upgrade was successful after I manually executed the bug-tracker apm-upgrade callbacks and made some fixes to the upgrade scripts.

Bad news: Unusable due to performance issues. I will post the explain plans soon, but in this state it is safe to say we can't upgrade.

Collapse
Posted by Malte Sussdorff on
This is only the *first* query to run for displaying *one* single bug.

upgrade-oo-5=> explain analyze select q.*,
upgrade-oo-5->        km.keyword_id,
upgrade-oo-5->        assign_info.*
upgrade-oo-5-> from (
upgrade-oo-5(>  select b.bug_id,
upgrade-oo-5(>          b.bug_number,
upgrade-oo-5(>          b.summary,
upgrade-oo-5(>          lower(b.summary) as lower_summary,
upgrade-oo-5(>          b.comment_content,
upgrade-oo-5(>          b.comment_format,
upgrade-oo-5(>          b.component_id,
upgrade-oo-5(>          b.creation_date,
upgrade-oo-5(>          to_char(b.creation_date, 'fmMM/DDfm/YYYY') as creation_date_pretty,
upgrade-oo-5(>          b.creation_user as submitter_user_id,
upgrade-oo-5(>          submitter.first_names as submitter_first_names,
upgrade-oo-5(>          submitter.last_name as submitter_last_name,
upgrade-oo-5(>          submitter.email as submitter_email,
upgrade-oo-5(>          lower(submitter.first_names) as lower_submitter_first_names,
upgrade-oo-5(>          lower(submitter.last_name) as lower_submitter_last_name,
upgrade-oo-5(>          lower(submitter.email) as lower_submitter_email,
upgrade-oo-5(>          st.pretty_name as pretty_state,
upgrade-oo-5(>          st.short_name as state_short_name,
upgrade-oo-5(>          st.state_id,
upgrade-oo-5(>          st.hide_fields,
upgrade-oo-5(>          b.resolution,
upgrade-oo-5(>          b.found_in_version,
upgrade-oo-5(>          b.fix_for_version,
upgrade-oo-5(>          b.fixed_in_version,
upgrade-oo-5(>          cas.case_id
upgrade-oo-5(>
upgrade-oo-5(>    from bt_bugs b,
upgrade-oo-5(>          acs_users_all submitter,
upgrade-oo-5(>          acs_users_all assignee,
upgrade-oo-5(>          workflow_cases cas,
upgrade-oo-5(>          workflow_case_fsm cfsm,
upgrade-oo-5(>          workflow_fsm_states st
upgrade-oo-5(>    where submitter.user_id = b.creation_user
upgrade-oo-5(>      and cas.workflow_id = '206311'
upgrade-oo-5(>      and cas.object_id = b.bug_id
upgrade-oo-5(>      and cfsm.case_id = cas.case_id
upgrade-oo-5(>      and cfsm.parent_enabled_action_id is null
upgrade-oo-5(>      and st.state_id = cfsm.current_state
upgrade-oo-5(>
upgrade-oo-5(>    and cfsm.current_state = '10'
upgrade-oo-5(> ) q
upgrade-oo-5-> left outer join
upgrade-oo-5->  cr_item_keyword_map km
upgrade-oo-5-> on (bug_id = km.item_id)
upgrade-oo-5-> left outer join
upgrade-oo-5->  (select cru.user_id as assigned_user_id,
upgrade-oo-5(>          aa.action_id,
upgrade-oo-5(>          aa.case_id,
upgrade-oo-5(>          wa.pretty_name as action_pretty_name,
upgrade-oo-5(>          assignee.first_names as assignee_first_names,
upgrade-oo-5(>          assignee.last_name as assignee_last_name
upgrade-oo-5(>      from workflow_case_assigned_actions aa,
upgrade-oo-5(>          workflow_case_role_user_map cru,
upgrade-oo-5(>          workflow_actions wa,
upgrade-oo-5(>          acs_users_all assignee
upgrade-oo-5(>    where aa.case_id = cru.case_id
upgrade-oo-5(>      and aa.role_id = cru.role_id
upgrade-oo-5(>      and cru.user_id = assignee.user_id
upgrade-oo-5(>      and wa.action_id = aa.action_id
upgrade-oo-5(>  ) assign_info
upgrade-oo-5-> on (q.case_id = assign_info.case_id)
upgrade-oo-5->    order by bug_number desc;
                                                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort  (cost=61090.28..61211.08 rows=48319 width=932) (actual time=13986.656..15343.380 rows=288864 loops=1)
  Sort Key: b.bug_number
  ->  Hash Left Join  (cost=2789.18..4940.99 rows=48319 width=932) (actual time=639.752..11525.309 rows=288864 loops=1)
        Hash Cond: ("outer".case_id = "inner".case_id)
        ->  Hash Left Join  (cost=1135.64..2441.64 rows=48319 width=666) (actual time=172.199..5360.479 rows=288864 loops=1)
              Hash Cond: ("outer".bug_id = "inner".item_id)
              ->  Nested Loop  (cost=1020.33..1400.22 rows=16106 width=662) (actual time=121.938..1925.355 rows=96288 loops=1)
                    ->  Nested Loop  (cost=0.00..57.77 rows=2 width=662) (actual time=0.262..3.768 rows=12 loops=1)
                          ->  Nested Loop  (cost=0.00..53.37 rows=1 width=670) (actual time=0.223..3.129 rows=12 loops=1)
                                ->  Nested Loop  (cost=0.00..52.21 rows=1 width=619) (actual time=0.178..2.463 rows=12 loops=1)
                                      ->  Nested Loop  (cost=0.00..48.57 rows=1 width=596) (actual time=0.141..1.896 rows=12 loops=1)
                                            ->  Nested Loop  (cost=0.00..41.49 rows=2 width=569) (actual time=0.106..1.341 rows=12 loops=1)
                                                  ->  Nested Loop  (cost=0.00..35.61 rows=1 width=12) (actual time=0.070..0.777 rows=12 loops=1)
                                                        ->  Index Scan using wf_case_fsm_state_idx on workflow_case_fsm cfsm  (cost=0.00..32.58 rows=1 width=8) (actual time=0.026..0.191 rows=12 loops=1)
                                                              Index Cond: (current_state = 10)
                                                              Filter: (parent_enabled_action_id IS NULL)
                                                        ->  Index Scan using workflow_cases_pk on workflow_cases cas  (cost=0.00..3.02 rows=1 width=8) (actual time=0.021..0.027 rows=1 loops=12)
                                                              Index Cond: ("outer".case_id = cas.case_id)
                                                              Filter: (workflow_id = 206311)
                                                  ->  Index Scan using bt_bug_pk on bt_bugs b  (cost=0.00..5.86 rows=1 width=561) (actual time=0.013..0.020 rows=1 loops=12)
                                                        Index Cond: ("outer".object_id = b.bug_id)
                                            ->  Index Scan using parties_pk on parties pa  (cost=0.00..3.53 rows=1 width=27) (actual time=0.014..0.020 rows=1 loops=12)
                                                  Index Cond: ("outer".creation_user = pa.party_id)
                                      ->  Index Scan using persons_pk on persons pe  (cost=0.00..3.63 rows=1 width=23) (actual time=0.013..0.019 rows=1 loops=12)
                                            Index Cond: ("outer".creation_user = pe.person_id)
                                ->  Seq Scan on workflow_fsm_states st  (cost=0.00..1.15 rows=1 width=55) (actual time=0.011..0.021 rows=1 loops=12)
                                      Filter: (10 = state_id)
                          ->  Index Scan using users_pk on users u  (cost=0.00..4.39 rows=1 width=4) (actual time=0.013..0.024 rows=1 loops=12)
                                Index Cond: (u.user_id = "outer".party_id)
                    ->  Materialize  (cost=1020.33..1100.86 rows=8053 width=0) (actual time=10.141..58.123 rows=8024 loops=12)
                          ->  Hash Join  (cost=329.57..1020.33 rows=8053 width=0) (actual time=121.638..322.682 rows=8024 loops=1)
                                Hash Cond: ("outer".person_id = "inner".party_id)
                                ->  Hash Join  (cost=158.64..648.08 rows=8052 width=8) (actual time=60.693..180.701 rows=8024 loops=1)
                                      Hash Cond: ("outer".user_id = "inner".person_id)
                                      ->  Seq Scan on users u  (cost=0.00..280.57 rows=8557 width=4) (actual time=0.006..34.454 rows=8024 loops=1)
                                      ->  Hash  (cost=138.51..138.51 rows=8051 width=4) (actual time=60.636..60.636 rows=0 loops=1)
                                            ->  Seq Scan on persons pe  (cost=0.00..138.51 rows=8051 width=4) (actual time=0.007..31.320 rows=8024 loops=1)
                                ->  Hash  (cost=150.75..150.75 rows=8075 width=4) (actual time=60.900..60.900 rows=0 loops=1)
                                      ->  Seq Scan on parties pa  (cost=0.00..150.75 rows=8075 width=4) (actual time=0.007..31.528 rows=8045 loops=1)
              ->  Hash  (cost=98.85..98.85 rows=6585 width=8) (actual time=50.218..50.218 rows=0 loops=1)
                    ->  Seq Scan on cr_item_keyword_map km  (cost=0.00..98.85 rows=6585 width=8) (actual time=0.013..26.073 rows=6585 loops=1)
        ->  Hash  (cost=1653.53..1653.53 rows=1 width=266) (actual time=467.482..467.482 rows=0 loops=1)
              ->  Subquery Scan assign_info  (cost=946.14..1653.53 rows=1 width=266) (actual time=467.474..467.474 rows=0 loops=1)
                    ->  Hash Join  (cost=946.14..1653.52 rows=1 width=44) (actual time=467.466..467.466 rows=0 loops=1)
                          Hash Cond: ("outer".user_id = COALESCE("inner".deputy_user_id, "inner".user_id))
                          ->  Hash Join  (cost=502.90..1170.00 rows=8053 width=31) (actual time=122.691..321.421 rows=8024 loops=1)
                                Hash Cond: ("outer".person_id = "inner".party_id)
                                ->  Hash Join  (cost=331.96..797.76 rows=8052 width=27) (actual time=61.971..172.555 rows=8024 loops=1)
                                      Hash Cond: ("outer".person_id = "inner".user_id)
                                      ->  Seq Scan on persons pe  (cost=0.00..138.51 rows=8051 width=23) (actual time=0.007..29.966 rows=8024 loops=1)
                                      ->  Hash  (cost=280.57..280.57 rows=8557 width=4) (actual time=61.847..61.847 rows=0 loops=1)
                                            ->  Seq Scan on users u  (cost=0.00..280.57 rows=8557 width=4) (actual time=0.010..32.245 rows=8024 loops=1)
                                ->  Hash  (cost=150.75..150.75 rows=8075 width=4) (actual time=60.683..60.683 rows=0 loops=1)
                                      ->  Seq Scan on parties pa  (cost=0.00..150.75 rows=8075 width=4) (actual time=0.007..31.467 rows=8045 loops=1)
                          ->  Hash  (cost=443.24..443.24 rows=1 width=25) (actual time=117.214..117.214 rows=0 loops=1)
                                ->  Hash Join  (cost=441.81..443.24 rows=1 width=25) (actual time=117.208..117.208 rows=0 loops=1)
                                      Hash Cond: ("outer".action_id = "inner".action_id)
                                      ->  Seq Scan on workflow_actions wa  (cost=0.00..1.28 rows=28 width=13) (actual time=0.006..0.126 rows=28 loops=1)
                                      ->  Hash  (cost=441.81..441.81 rows=2 width=20) (actual time=116.975..116.975 rows=0 loops=1)
                                            ->  Hash Join  (cost=75.63..441.81 rows=2 width=20) (actual time=116.968..116.968 rows=0 loops=1)
                                                  Hash Cond: ("outer".user_id = "inner".member_id)
                                                  ->  Hash Left Join  (cost=0.00..323.38 rows=8557 width=8) (actual time=0.036..88.308 rows=8024 loops=1)
                                                        Hash Cond: ("outer".user_id = "inner".user_id)
                                                        ->  Seq Scan on users u  (cost=0.00..280.57 rows=8557 width=4) (actual time=0.006..30.137 rows=8024 loops=1)
                                                        ->  Hash  (cost=0.00..0.00 rows=1 width=8) (actual time=0.010..0.010 rows=0 loops=1)
                                                              ->  Seq Scan on workflow_deputies dep  (cost=0.00..0.00 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=1)
                                                                    Filter: ((('now'::text)::timestamp(6) with time zone >= start_date) AND (('now'::text)::timestamp(6) with time zone <= end_date))
                                                  ->  Hash  (cost=75.62..75.62 rows=1 width=16) (actual time=0.301..0.301 rows=0 loops=1)
                                                        ->  Nested Loop  (cost=1.35..75.62 rows=1 width=16) (actual time=0.294..0.294 rows=0 loops=1)
                                                              ->  Nested Loop  (cost=1.35..10.07 rows=1 width=16) (actual time=0.288..0.288 rows=0 loops=1)
                                                                    ->  Nested Loop  (cost=1.35..4.46 rows=1 width=20) (actual time=0.280..0.280 rows=0 loops=1)
                                                                          ->  Hash Join  (cost=1.35..1.43 rows=1 width=16) (actual time=0.274..0.274 rows=0 loops=1)
                                                                                Hash Cond: ("outer".action_id = "inner".action_id)
                                                                                ->  Seq Scan on workflow_case_enabled_actions wcea  (cost=0.00..0.00 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=1)
                                                                                      Filter: ((completed_p = false) AND (assigned_p = true))
                                                                                ->  Hash  (cost=1.28..1.28 rows=28 width=8) (actual time=0.252..0.252 rows=0 loops=1)
                                                                                      ->  Seq Scan on workflow_actions wa  (cost=0.00..1.28 rows=28 width=8) (actual time=0.013..0.130 rows=28 loops=1)
                                                                          ->  Index Scan using workflow_cases_pk on workflow_cases c  (cost=0.00..3.01 rows=1 width=4) (never executed)
                                                                                Index Cond: (c.case_id = "outer".case_id)
                                                                    ->  Index Scan using wf_case_role_pty_map_pk on workflow_case_role_party_map wcrpm  (cost=0.00..5.60 rows=1 width=12) (never executed)
                                                                          Index Cond: ((wcrpm.case_id = "outer".case_id) AND ("outer".assigned_role = wcrpm.role_id))
                                                              ->  Index Scan using party_approved_member_map_pk on party_approved_member_map pamm  (cost=0.00..65.34 rows=17 width=8) (never executed)
                                                                    Index Cond: (pamm.party_id = "outer".party_id)
Total runtime: 16444.930 ms
(84 rows)

Collapse
Posted by Jade Rubick on
A couple of things I'd start with:

Replace acs_users_all with a join on persons and parties. That removes the join on the users table, which should give us a but measurable improvement, especially since there are a couple sequential scans on persons, parties, and users.

I'd like to look at the query to figure out what's going on with the subselects in the from statement. I've had to do that a couple times in the past, but maybe in this case it isn't necessary.

Collapse
Posted by Malte Sussdorff on
The site currently runs at http://sussdorff.dyndns.org:8000/. As I need to do it anyway at one point in time, I will copy the whole code along with the database snapshot over to samoyed, so people who have access to that machine can go ahead and play around.
I installed AOLserver and Postgres on samoyed. Furthermore there is a file at /var/lib/aolserver called samoyed.tgz, which contains the code and the database file(s) that make up my current installation of openacs.org. As it takes some time to upload, expect it to be available at 8pm GMT. If you are keen to work on it immediately, just unpack it and load the file upgrade-3.dmp into postgresql (user upgrade-oo-5 and database upgrade-oo-5 already created). Good luck with tuning and thanks for taking this on Jade.
URL changed. check out http://samoyed.furfly.net :)
Collapse
Posted by Andrew Piskorski on
The SourceForge bug tracker sucks fairly badly; I don't particularly recommend that OpenACS use it.

It would also be really annoying (and rather embarassing to the project) to see all the old bug data in the current bug tracker simple abandoned. Nearly as bad would be to keep the "old" openacs.org site running solely in order to access the old bug tracker. OpenACS already did both those things once when converting openacs.org from OpenACS 3.x to 4.x - just plain ugly.

I would like to hear Lars' opinion of Bug Tracker, as well as anyone else who is familiar with the Bug Tracker internals. Is Bug Tracker basically fine and just needs some query tuning? Does it suffer from serious internal design problems? Is it too complicated? Or what?

Since Malte already figured out how to upgrade the openacs.org Bug Tracker, tuning the queries seems fairly likely to be both the least effort and best result path. Major rewrites of Bug Tracker are probably premature until someone at least tries seriously to figure out why those queries suddenly suck so badly, despite being ok on the old (current) openacs.org site.