Forum OpenACS Development: calendar in .LRN really slow

Collapse
Posted by Bernt Pettersen on
I really can't figure this one out:

Whenever someone logs into My Space, and the day-summary is to be generated, it takes like 12(!) seconds...

The user has 5 communities and 5 classes.

[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: SQL():
        select to_char(start_date, 'YYYY-MM-DD HH24:MI:SS') as ansi_start_date,
         to_char(end_date, 'YYYY-MM-DD HH24:MI:SS') as ansi_end_date,
         nvl(e.name, a.name) as name,
         nvl(e.status_summary, a.status_summary) as status_summary,
         e.event_id as item_id,
         (select type from cal_item_types where item_type_id= cal_items.item_type_id) as item_type,
         on_which_calendar as calendar_id,
         (select calendar_name from calendars
         where calendar_id = on_which_calendar)
         as calendar_name,
         to_char(timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format)),'YYYY-MM-DD HH24:MI:SS') as start_interval,
         to_char(timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format) + (24 - 1/3600)/24),'YYYY-MM-DD HH24:MI:SS')  as end_interval
from     acs_activities a,
         acs_events e,
         timespans s,
         time_intervals t,
         cal_items
where    e.timespan_id = s.timespan_id
and      s.interval_id = t.interval_id
and      e.activity_id = a.activity_id
and      start_date between
         timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format)) and
         timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format) + (24 - 1/3600)/24)
and     cal_items.cal_item_id= e.event_id
and      e.event_id
in       (
         select  cal_item_id
         from    cal_items
         where   on_which_calendar in (421755,18243,18478,981405,1638428)
         )


[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'timezone' = 'Europe/Oslo'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'current_date' = '2003-07-04'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'date_format' = 'YYYY-MM-DD HH24:MI'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'timezone' = 'Europe/Oslo'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'current_date' = '2003-07-04'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'date_format' = 'YYYY-MM-DD HH24:MI'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'timezone' = 'Europe/Oslo'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'current_date' = '2003-07-04'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'date_format' = 'YYYY-MM-DD HH24:MI'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'timezone' = 'Europe/Oslo'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'current_date' = '2003-07-04'
[04/Jul/2003:13:01:51][936.5126][-conn3-] Notice: bind variable 'date_format' = 'YYYY-MM-DD HH24:MI'
[04/Jul/2003:13:02:03][936.5126][-conn3-] Notice: SQL():
select   to_char(to_date(:current_date, 'yyyy-mm-dd')-1, 'yyyy-mm-dd')
as yesterday,
to_char(to_date(:current_date, 'yyyy-mm-dd')+1, 'yyyy-mm-dd')
as tomorrow
from     dual

[04/Jul/2003:13:02:03][936.5126][-conn3-] Notice: bind variable 'current_date' = '2003-07-04'
[04/Jul/2003:13:02:03][936.5126][-conn3-] Notice: bind variable 'current_date' = '2003-07-04'

The last select statement generates like 150 cal_items;

         select  cal_item_id
         from    cal_items
         where   on_which_calendar in (421755,18243,18478,981405,1638428)

Any clues for where to look to improve performance?

(Oracle 8.1.7, calendar 1.0, acs-kernel 4.7.2d)

Collapse
Posted by Jarkko Laine on
Hei Bernt,

Are you able to use ACS Developer Support to find out what is hogging the most time when loading that page? You have to set DatabaseEnabledP parameter to 1 in acs-developer-supports's parameters in order to get the db query information.

I think the package is very useful when finding bottlenecks of page requests like that.

Collapse
Posted by Bernt Pettersen on
Well, I installed ACS Dev Support.

And it only confirms my qualified guessing;

Request Processor

  • +1.6 ms: Applied transformation from /web/uib-staging/www / dotlrn/ -> ? - 1.6 ms
  • +5.2 ms: Served file /web/uib-staging/packages/dotlrn/www/index.adp with adp_parse_ad_conn_file - 12357.8 ms
  • +12363.9 ms: Applied GET filter: (for /dotlrn/ ds_trace_filter) - 1.6 ms
    • returned filter_ok

All but this db-request are 0-8 ms...

11621 ms  	  pool2  	dbqd.calendar.tcl.calendar-display-procs.calendar::one_day_display.select_day_items: select nsdb0

	select to_char(start_date, 'YYYY-MM-DD HH24:MI:SS') as ansi_start_date,
         to_char(end_date, 'YYYY-MM-DD HH24:MI:SS') as ansi_end_date,
         nvl(e.name, a.name) as name,
         nvl(e.status_summary, a.status_summary) as status_summary,
         e.event_id as item_id,
         (select type from cal_item_types where item_type_id= cal_items.item_type_id) as item_type,
	 on_which_calendar as calendar_id,
	 (select calendar_name from calendars 
	 where calendar_id = on_which_calendar)
	 as calendar_name,
         to_char(timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format)),'YYYY-MM-DD HH24:MI:SS') as start_interval,
         to_char(timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format) + (24 - 1/3600)/24),'YYYY-MM-DD HH24:MI:SS')  as end_interval
from     acs_activities a,
         acs_events e,
         timespans s,
         time_intervals t,
         cal_items
where    e.timespan_id = s.timespan_id
and      s.interval_id = t.interval_id
and      e.activity_id = a.activity_id
and      start_date between
         timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format)) and
         timezone.local_to_utc(timezone.get_id(:timezone),to_date(:current_date,:date_format) + (24 - 1/3600)/24)
and     cal_items.cal_item_id= e.event_id
and      e.event_id
in       (
         select  cal_item_id
         from    cal_items
         where   on_which_calendar in (1586,18243,18478,981405,983051,1141981,983359,878911,392985,1032333,139824,1301790,890307,3050564,1235418)
         )
	
Collapse
Posted by Dirk Gomez on

This query is awful! It joins five tables, has a couple of subqueries and hardcoded values so that the query needs to hard-parsed on very call.

That said: I was foolish enough to pick up calendar maintainership :) I'm currently haggling with the tcl procs that generate html.

The datamodel is quite broken, so are the queries. (Read the comments, they are fairly funny). It'll take some time for me to correct this though.

If you are willing to help me, my first question would be: Wtf is time_spans and time_intervals and why is being used in this d?mned calendar?

Collapse
Posted by Don Baccus on
Yes, the datamodel sucks but it shouldn't run that slow, and it doesn't on SloanSpace.

Which makes me wonder ... are you sure that you've told Oracle to gather statistics on your tables in order to raise odds that it's generated a decent execution plan?

Collapse
Posted by Bernt Pettersen on
Dirk:

I am not really sure what timaspan and time_interval really is all about, but they're defined in:

-- packages/acs-events/sql/timespan-create.sql
-- This script defines the data models for both time_interval and timespan.

Our tables have quite a few of those:

SQL> SELECT COUNT(*) FROM timespans;

  COUNT(*)
----------
     12681

SQL> SELECT COUNT(*) FROM time_intervals;

  COUNT(*)
----------
     25362

And I'm pretty sure Oracle will slow down due to parsing of the query, instead of bind'ing the variables. But that couldn't be the major problem here.

Don:

Yes, I'm sure it doesn't run this slow on SloanSpace. 😊

I'll talk to our dba on monday, try to figure out if there's problems with our Oracle.

I tried myself yesterday to use some Oracle tools to get an explain plan. Seemed getting 3 (full) tables took 64 out of a total of 80 costs. But I don't remember the details, and I'm most certainly no dba. :-\

Collapse
Posted by Don Baccus on
Make sure that statistics gathering is happening (install is supposed to set this up automatically but it doesn't hurt to double-check)

Make sure that shared pool size and various other Oracle settings are set large enough (your dba should be able to help you here)  You should be seeing essentially zero disk I/O for read-only queries like this.

We know we need to work towards more effective caching of stuff like this but it's going to take time ...

Dirk: the timezone calculations shouldn't be done in the query because it's probably being calculated for every bleepin' row, knowing how brain-dead Oracle can be.  Since they're calculated on bindvars they're constant values for the duration of the query ...

Collapse
Posted by Jon Griffin on
acs-events is a generalized package. Timespans and timeintervals are an integral part of calendars and events.

Basically (IIRC, when this was specd), an activity is generated, it has neither time or date (i.e. walk in the park), it becomes an event when a time and/or date is created. An event can have a start date/time and no end time and vice-versa (i.e. walk 3 miles on Sat at 4PM).

Events can also span more than one timespan and/or interval. acs-events is not broken and calendar correctly uses it's services. The rest of the calendar datamodel is as you say in need of work (there are some other threads regarding this.)

Collapse
Posted by Bernt Pettersen on
Well, our tuning didn't help
- Shared Global Area is now 1GB
- Cache Hit Ratio is 99,94%
- Statistics gathering is on

Still, the day-summary takes more than 10 seconds.

Do you think it will help to "reindex" the indexes?
Or maybe export/import the whole DB?

- The ACS/dotLRN DB currently uses 930 MB

Collapse
Posted by Dirk Gomez on

Can you post the plan then. Also please post a count of the tables, so that we can guess where most time gets lost.

Here's a doc that describes what you have to do: Autotrace in SQLPLUS

Collapse
Posted by Bernt Pettersen on
Ah! Thanks for the link.

Here's the table sizes

TABLE_NAME                       NUM_ROWS
------------------------------ ----------
ACS_ACTIVITIES                       2946
TIME_INTERVALS                      29110
TIMESPANS                           14555
ACS_EVENTS                          11736
TIMEZONES                              52
CALENDARS                           28858
CAL_ITEM_TYPES                         43
CAL_ITEMS                           11736

And the execution plan:

no rows selected


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=79 Card=1 Bytes=136)
   1    0   NESTED LOOPS (Cost=79 Card=1 Bytes=136)
   2    1     NESTED LOOPS (Cost=78 Card=1 Bytes=125)
   3    2       HASH JOIN (Cost=77 Card=1 Bytes=56)
   4    3         HASH JOIN (Cost=66 Card=59 Bytes=2714)
   5    4           HASH JOIN (Cost=44 Card=73 Bytes=1898)
   6    5             TABLE ACCESS (FULL) OF 'TIME_INTERVALS' (Cost=33 Card=73 Bytes=1314)
   7    5             TABLE ACCESS (FULL) OF 'TIMESPANS' (Cost=10 Card=14555 Bytes=116440)
   8    4           TABLE ACCESS (FULL) OF 'ACS_EVENTS' (Cost=21 Card=11736 Bytes=234720)
   9    3         TABLE ACCESS (FULL) OF 'CAL_ITEMS' (Cost=10 Card=103 Bytes=1030)
  10    2       TABLE ACCESS (BY INDEX ROWID) OF 'ACS_ACTIVITIES' (Cost=1 Card=2946 Bytes=203274)
  11   10         INDEX (UNIQUE SCAN) OF 'ACS_ACTIVITIES_PK' (UNIQUE)
  12    1     TABLE ACCESS (BY INDEX ROWID) OF 'CAL_ITEMS' (Cost=1 Card=11736 Bytes=129096)
  13   12       INDEX (UNIQUE SCAN) OF 'CAL_ITEM_CAL_ITEM_ID_PK' (UNIQUE)


Statistics
----------------------------------------------------------
     114430  recursive calls
     228875  db block gets
     286537  consistent gets
          0  physical reads
          0  redo size
        716  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
         22  sorts (memory)
          0  sorts (disk)
          0  rows processed
Collapse
Posted by Dirk Gomez on

Do you have indexes on time_intervals and timespans? The first thing that Oracle does while executing this statement is joining those two tables - with a full table scan.

It's a bit hard to say whether a full table scan here is inevitable because I don't understand the semantics of them. It could well be that Oracle needs to touch a big enough part of rows in these tables so that a full-table scan appears (is?) cheaper.

What happens if you remove both tables from query. Just performance-wise...we have to check the semantics a little later then :)

Collapse
Posted by Bernt Pettersen on
Without time_intervals and timespans the query uses almost no time... (dont know exactly)

I'll check for indexes in a while...

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=54 Card=147 Bytes=16
          317)

   1    0   HASH JOIN (Cost=54 Card=147 Bytes=16317)
   2    1     HASH JOIN (Cost=43 Card=150 Bytes=14850)
   3    2       HASH JOIN (Cost=32 Card=150 Bytes=4650)
   4    3         TABLE ACCESS (FULL) OF 'CAL_ITEMS' (Cost=10 Card=150
           Bytes=1800)

   5    3         TABLE ACCESS (FULL) OF 'ACS_EVENTS' (Cost=21 Card=11
          940 Bytes=226860)

   6    2       TABLE ACCESS (FULL) OF 'ACS_ACTIVITIES' (Cost=10 Card=
          3050 Bytes=207400)

   7    1     TABLE ACCESS (FULL) OF 'CAL_ITEMS' (Cost=10 Card=11670 B
          ytes=140040)





Statistics
----------------------------------------------------------
          4  recursive calls
         25  db block gets
        334  consistent gets
          0  physical reads
          0  redo size
        844  bytes sent via SQL*Net to client
        430  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Collapse
Posted by Dirk Gomez on

It now uses full-table scans for all tables - which it shouldn't. At least the table that contains the field start_date should be accessed via an index because you want only a *very few* rows from that table. Looks quite weird.

Anyway the join between time_intervals and timespans is gigantic (30.000x15.000). Does someone know what they mean with respect to calendar?

Collapse
Posted by Bernt Pettersen on
YES!

I turned on indexes on time_intervals.start_date and time_intervals.end_date.

Now it runs like h*ll! 😊

no rows selected


Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=52 Card=1 Bytes=145)
   1    0   NESTED LOOPS (Cost=52 Card=1 Bytes=145)
   2    1     NESTED LOOPS (Cost=51 Card=1 Bytes=133)
   3    2       HASH JOIN (Cost=50 Card=1 Bytes=65)
   4    3         HASH JOIN (Cost=39 Card=60 Bytes=3180)
   5    4           HASH JOIN (Cost=17 Card=73 Bytes=2117)
   6    5             TABLE ACCESS (BY INDEX ROWID) OF 'TIME_INTERVALS
          ' (Cost=6 Card=73 Bytes=1460)

   7    6               INDEX (RANGE SCAN) OF 'TIME_INTERVALS_START_DA
          TE_IDX' (NON-UNIQUE) (Cost=2 Card=73)

   8    5             TABLE ACCESS (FULL) OF 'TIMESPANS' (Cost=10 Card
          =14460 Bytes=130140)

   9    4           TABLE ACCESS (FULL) OF 'ACS_EVENTS' (Cost=21 Card=
          11940 Bytes=286560)

  10    3         TABLE ACCESS (FULL) OF 'CAL_ITEMS' (Cost=10 Card=150
           Bytes=1800)

  11    2       TABLE ACCESS (BY INDEX ROWID) OF 'ACS_ACTIVITIES' (Cos
          t=1 Card=3050 Bytes=207400)

  12   11         INDEX (UNIQUE SCAN) OF 'ACS_ACTIVITIES_PK' (UNIQUE)
  13    1     TABLE ACCESS (BY INDEX ROWID) OF 'CAL_ITEMS' (Cost=1 Car
          d=11670 Bytes=140040)

  14   13       INDEX (UNIQUE SCAN) OF 'CAL_ITEM_CAL_ITEM_ID_PK' (UNIQ
          UE)





Statistics
----------------------------------------------------------
         12  recursive calls
         21  db block gets
        300  consistent gets
          1  physical reads
          0  redo size
        716  bytes sent via SQL*Net to client
        319  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
         20  sorts (memory)
          0  sorts (disk)
          0  rows processed
Collapse
Posted by Don Baccus on
I've added an index on the start_date to CVS HEAD, that should take care of the problem.  I'm still surprised at the poor quality of the plan you were seeing earlier ...