Forum OpenACS Development: Suggestions for calendar scalability?

I'm starting to do some tuning on Sloan's upgraded OACS5.1 + .LRN 2.1 pre-release system.

One of the slower queries is dbqd.calendar.www.views.select_items. For the calendar "day view" it takes about 1.6 seconds. We've got roughly 36,000 rows in cal_items and > 10,000 in each of the six tables that are joined.

I've experimented a little bit with creating indexes but haven't really thought deeply about it yet.

If you happened to have tuned this query, or have ideas about how to tune it, I'd appreciate hearing about it.

Here's the query for reference:

select to_char(start_date, 'YYYY-MM-DD HH24:MI:SS') as ansi_start_date,
       to_char(end_date, 'YYYY-MM-DD HH24:MI:SS') as ansi_end_date,
       to_number(to_char(start_date,'HH24'),'90') as start_hour,
       to_number(to_char(end_date,'HH24'),'90') as end_hour,
       to_number(to_char(end_date,'MI'),'90') as end_minutes,
       nvl(e.name, a.name) as name,
       nvl(e.status_summary, a.status_summary) as status_summary,
       e.event_id as item_id,
       (select type from cal_item_types where item_type_id= ci.item_type_id) as item_type,
cals.calendar_id,
cals.calendar_name
  from acs_activities a,
       acs_events e,
       timespans s,
       time_intervals t,
       cal_items ci,
       calendars cals
  where    e.timespan_id = s.timespan_id
  and      s.interval_id = t.interval_id
  and      e.activity_id = a.activity_id
  and      start_date between 
  to_date(:current_date,'YYYY-MM-DD') 
  and (to_date(:current_date,'YYYY-MM-DD') + (24 - 1/3600)/24)

  and      ci.cal_item_id = e.event_id
  and      cals.calendar_id = ci.on_which_calendar
  and      e.event_id = ci.cal_item_id
   and (to_char(start_date, 'HH24:MI') <> '00:00' or to_char(end_date, 'HH24:MI') <> '00:00')
and on_which_calendar in (long list of integers) and (cals.private_p='f' or (cals.private_p='t' and cals.owner_id= :user_id))
   order by to_char(start_date,'HH24')
Collapse
Posted by Andrew Grumet on
The following two changes reduce the query time to 170ms:

1. Force the join order as follows

select /*+ ORDERED */
       column list
from cal_items ci,
     acs_events e,
     timespans s,
     time_intervals t,
     acs_activities a,
     calendars cals
For some reason Oracle wasn't picking the right order. I'm pretty sure that most queries start by restricting to particular calendars, stored in cal_items.on_which_calendar, so we will almost always want to start there. A notable exception are "all day" events, which due to an implementation quirk may find better selectivity in the start and end dates in the timespans table.

2. Create these indexes:

create index acs_events_cal_idx on acs_events (activity_id,timespan_id,event_id,name,status_summary);
create index cal_items_fk_idx on cal_items (on_which_calendar,cal_item_id,item_type_id);
These prevent full table scans by combining key columns from the WHERE and SELECT clauses of the query.

I'll fiddle with these a bit more and wait for any feedback before committing to OACS.

Collapse
Posted by Dirk Gomez on
Can you post two executions plan: one with the ORDERED hint, one without it. Then we may be able to see why Oracle goes wrong.

As to scalability of the data model: I have a half-finished version of calendar that doesn't depend on acs-events anymore (PG). There is so many tables joined in this query - it's probably impossible to make it scale nicely.

Collapse
Posted by Andrew Grumet on
Sure. Here are the plans without the indexes mentioned above:

Without /*+ ORDERED */ [SLOW]

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=41 Card=1 Bytes=208)
   1    0   SORT (ORDER BY) (Cost=41 Card=1 Bytes=208)
   2    1     NESTED LOOPS (Cost=37 Card=1 Bytes=208)
   3    2       NESTED LOOPS (Cost=36 Card=1 Bytes=188)
   4    3         NESTED LOOPS (Cost=34 Card=1 Bytes=179)
   5    4           NESTED LOOPS (Cost=33 Card=1 Bytes=61)
   6    5             HASH JOIN (Cost=32 Card=1 Bytes=35)
   7    6               TABLE ACCESS (FULL) OF 'CAL_ITEMS' (Cost=9 Car
          d=852 Bytes=10224)

   8    6               TABLE ACCESS (FULL) OF 'ACS_EVENTS' (Cost=22 C
          ard=36493 Bytes=839339)

   9    5             INLIST ITERATOR
  10    9               TABLE ACCESS (BY INDEX ROWID) OF 'CALENDARS' (
          Cost=1 Card=15 Bytes=390)

  11   10                 INDEX (UNIQUE SCAN) OF 'CALENDARS_CALENDAR_I
          D_PK' (UNIQUE)

  12    4           TABLE ACCESS (BY INDEX ROWID) OF 'ACS_ACTIVITIES'
          (Cost=1 Card=16707 Bytes=1971426)

  13   12             INDEX (UNIQUE SCAN) OF 'ACS_ACTIVITIES_PK' (UNIQ
          UE)

  14    3         TABLE ACCESS (BY INDEX ROWID) OF 'TIMESPANS' (Cost=2
           Card=41130 Bytes=370170)

  15   14           INDEX (RANGE SCAN) OF 'TIMESPANS_IDX' (NON-UNIQUE)
           (Cost=1 Card=41130)

  16    2       TABLE ACCESS (BY INDEX ROWID) OF 'TIME_INTERVALS' (Cos
          t=1 Card=206 Bytes=4120)

  17   16         INDEX (UNIQUE SCAN) OF 'TIME_INTERVALS_PK' (UNIQUE)
With /*+ ORDERED */ [FAST]
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=41 Card=1 Bytes=208)
   1    0   SORT (ORDER BY) (Cost=41 Card=1 Bytes=208)
   2    1     NESTED LOOPS (Cost=37 Card=1 Bytes=208)
   3    2       NESTED LOOPS (Cost=36 Card=1 Bytes=182)
   4    3         NESTED LOOPS (Cost=35 Card=1 Bytes=64)
   5    4           NESTED LOOPS (Cost=34 Card=1 Bytes=44)
   6    5             HASH JOIN (Cost=32 Card=1 Bytes=35)
   7    6               TABLE ACCESS (FULL) OF 'CAL_ITEMS' (Cost=9 Car
          d=852 Bytes=10224)

   8    6               TABLE ACCESS (FULL) OF 'ACS_EVENTS' (Cost=22 C
          ard=36493 Bytes=839339)

   9    5             TABLE ACCESS (BY INDEX ROWID) OF 'TIMESPANS' (Co
          st=2 Card=41130 Bytes=370170)

  10    9               INDEX (RANGE SCAN) OF 'TIMESPANS_IDX' (NON-UNI
          QUE) (Cost=1 Card=41130)

  11    4           TABLE ACCESS (BY INDEX ROWID) OF 'TIME_INTERVALS'
          (Cost=1 Card=206 Bytes=4120)

  12   11             INDEX (UNIQUE SCAN) OF 'TIME_INTERVALS_PK' (UNIQ
          UE)

  13    3         TABLE ACCESS (BY INDEX ROWID) OF 'ACS_ACTIVITIES' (C
          ost=1 Card=16707 Bytes=1971426)

  14   13           INDEX (UNIQUE SCAN) OF 'ACS_ACTIVITIES_PK' (UNIQUE
          )

  15    2       INLIST ITERATOR
  16   15         TABLE ACCESS (BY INDEX ROWID) OF 'CALENDARS' (Cost=1
           Card=15 Bytes=390)

  17   16           INDEX (UNIQUE SCAN) OF 'CALENDARS_CALENDAR_ID_PK'
          (UNIQUE)