Forum OpenACS Q&A: longdb 5 seconds query in assessment portlet

Warning: db_exec: longdb 5 seconds nsdb0 select dbqd.assessment-portlet.www.assessment-portlet.open_asssessments

Any idea how I could improve the above query? It takes ages:

select * from (select cr.item_id as assessment_id, cr.title, cr.description, a.password,
a.type,
to_char(a.start_time, 'YYYY-MM-DD HH24:MI:SS') as start_time,
to_char(a.end_time, 'YYYY-MM-DD HH24:MI:SS') as end_time,
to_char(now(), 'YYYY-MM-DD HH24:MI:SS') as cur_time,
cf.package_id, p.instance_name as community_name,
sc.node_id as comm_node_id, sa.node_id as as_node_id, a.anonymous_p,
acs_permission__permission_p(a.assessment_id,'12345','admin') as admin_p,
(select count(*) from as_sessions s1,
cr_revisions cr1 where
s1.assessment_id=cr1.revision_id
and cr1.item_id=cr.item_id
and s1.subject_id='12345'
and completed_datetime is null) as in_progress_p,
(select count(*) from as_sessions s1,
cr_revisions cr1 where
s1.assessment_id=cr1.revision_id
and cr1.item_id=cr.item_id
and s1.subject_id='12345'
and completed_datetime is not null) as completed_p,
a.number_tries

from as_assessments a, cr_revisions cr, cr_items ci, cr_folders cf,
site_nodes sa, site_nodes sc, apm_packages p,
(select distinct asm.assessment_id
from as_assessment_section_map asm, as_item_section_map ism
where ism.section_id = asm.section_id
and exists (select 1 from acs_object_party_privilege_map ppm
where ppm.object_id = asm.assessment_id
and ppm.privilege = 'read'
and ppm.party_id = '12345')) s
where a.assessment_id = cr.revision_id
and cr.revision_id = ci.latest_revision
and ci.parent_id = cf.folder_id
and ci.publish_status = 'live'
and cf.package_id in (6684174, 6508603, 6339314)
and sa.object_id = cf.package_id
and sc.node_id = sa.parent_id
and p.package_id = sc.object_id
and s.assessment_id = a.assessment_id
and ((a.start_time < current_timestamp and a.end_time > current_timestamp) or a.start_time is null)
order by lower(p.instance_name), lower(cr.title)
) q where (q.completed_p < q.number_tries) or (q.number_tries=0 or q.number_tries is null)

Collapse
Posted by Dave Bauer on
You need to analyze the query first of all.

use

select explain analyze .... query ...
in psql.

Read the extensive advice here.
https://openacs.org/xowiki/postgresql-admin

If this is not the only slow query you have, but other complex queries are also slow, you might need to tune PostgreSQL itself as well. There is additional advice on that page for that, but first analyze the query.

I know in older versions of assessment there were missing indexes but I don't know if they affected this query. You might check the latest code in CVS if you are not running dotlrn 2.3.1

Collapse
Posted by Nima Mazloumi on
I am running assessment 0.22d4, assessment portlet 2.3.0 and dotlrn 2.3.0.

Here the result of analyze:

"QUERY PLAN"
"Subquery Scan q (cost=37698.83..37698.85 rows=1 width=1939) (actual time=6706.558..6706.560 rows=1 loops=1)"
" -> Sort (cost=37698.83..37698.84 rows=1 width=355) (actual time=6706.554..6706.554 rows=1 loops=1)"
" Sort Key: lower((p.instance_name)::text), lower((cr.title)::text)"
" -> Nested Loop (cost=36891.59..37698.82 rows=1 width=355) (actual time=6705.466..6706.465 rows=1 loops=1)"
" Join Filter: (((subplan) < ""outer"".number_tries) OR (""outer"".number_tries = 0) OR (""outer"".number_tries IS NULL))"
" -> Nested Loop (cost=36891.59..37673.83 rows=1 width=290) (actual time=6695.675..6696.668 rows=1 loops=1)"
" -> Nested Loop (cost=36891.59..37670.72 rows=1 width=279) (actual time=6695.638..6696.628 rows=1 loops=1)"
" -> Nested Loop (cost=36891.59..37667.69 rows=1 width=275) (actual time=6695.601..6696.586 rows=1 loops=1)"
" -> Nested Loop (cost=36891.59..37662.53 rows=1 width=267) (actual time=6695.551..6696.530 rows=1 loops=1)"
" -> Nested Loop (cost=36891.59..37658.98 rows=1 width=267) (actual time=6695.513..6696.426 rows=4 loops=1)"
" -> Nested Loop (cost=36891.59..37655.80 rows=1 width=12) (actual time=6695.480..6696.346 rows=4 loops=1)"
" -> Unique (cost=36891.59..36905.64 rows=180 width=4) (actual time=6689.629..6693.900 rows=349 loops=1)"
" -> Sort (cost=36891.59..36898.62 rows=2810 width=4) (actual time=6689.625..6691.426 rows=4997 loops=1)"
" Sort Key: asm.assessment_id"
" -> Merge Join (cost=0.00..36730.63 rows=2810 width=4) (actual time=17.690..6682.865 rows=4997 loops=1)"
" Merge Cond: (""outer"".section_id = ""inner"".section_id)"
" -> Index Scan using as_assessment_section_map_pk2 on as_assessment_section_map asm (cost=0.00..36598.65 rows=370 width=8) (actual time=9.148..6670.322 rows=740 loops=1)"
" Filter: (subplan)"
" SubPlan"
" -> Hash Join (cost=5.16..3103.83 rows=70 width=0) (actual time=9.010..9.010 rows=1 loops=740)"
" Hash Cond: ((""outer"".privilege)::text = (""inner"".privilege)::text)"
" -> Nested Loop (cost=0.00..3095.67 rows=230 width=11) (actual time=9.007..9.007 rows=1 loops=740)"
" Join Filter: (""outer"".party_id = ""inner"".grantee_id)"
" -> Index Scan using party_member_member_idx on party_approved_member_map pamm (cost=0.00..87.10 rows=23 width=4) (actual time=0.008..0.152 rows=82 loops=740)"
" Index Cond: (member_id = 12345)"
" -> Nested Loop (cost=0.00..106.48 rows=1946 width=15) (actual time=0.013..0.098 rows=17 loops=60680)"
" -> Index Scan using acs_object_context_index_pk on acs_object_context_index c (cost=0.00..24.71 rows=11 width=4) (actual time=0.006..0.014 rows=7 loops=60680)"
" Index Cond: (object_id = $1)"
" -> Index Scan using acs_permissions_object_id_idx on acs_permissions p (cost=0.00..7.33 rows=8 width=19) (actual time=0.006..0.009 rows=3 loops=420320)"
" Index Cond: (""outer"".ancestor_id = p.object_id)"
" -> Hash (cost=5.15..5.15 rows=3 width=14) (actual time=0.089..0.089 rows=2 loops=1)"
" -> Bitmap Heap Scan on acs_privilege_descendant_map pdm (cost=2.01..5.15 rows=3 width=14) (actual time=0.073..0.074 rows=2 loops=1)"
" Recheck Cond: ((descendant)::text = 'read'::text)"
" -> Bitmap Index Scan on acs_priv_desc_map_idx (cost=0.00..2.01 rows=3 width=0) (actual time=0.058..0.058 rows=2 loops=1)"
" Index Cond: ((descendant)::text = 'read'::text)"
" -> Index Scan using as_item_section_map_sort_order_idx on as_item_section_map ism (cost=0.00..90.23 rows=2643 width=4) (actual time=0.037..5.744 rows=5007 loops=1)"
" -> Index Scan using cr_items_by_latest_revision on cr_items ci (cost=0.00..4.15 rows=1 width=8) (actual time=0.006..0.006 rows=0 loops=349)"
" Index Cond: (""outer"".assessment_id = ci.latest_revision)"
" Filter: ((publish_status)::text = 'live'::text)"
" -> Index Scan using as_assessments_assessment_id_pk on as_assessments a (cost=0.00..3.17 rows=1 width=255) (actual time=0.015..0.016 rows=1 loops=4)"
" Index Cond: (""outer"".assessment_id = a.assessment_id)"
" Filter: (((start_time < now()) AND (end_time > now())) OR (start_time IS NULL))"
" -> Index Scan using cr_folders_pk on cr_folders cf (cost=0.00..3.54 rows=1 width=8) (actual time=0.023..0.024 rows=0 loops=4)"
" Index Cond: (""outer"".parent_id = cf.folder_id)"
" Filter: ((package_id = 6684174) OR (package_id = 6508603) OR (package_id = 6339314))"
" -> Index Scan using site_nodes_object_id_idx on site_nodes sa (cost=0.00..5.12 rows=3 width=12) (actual time=0.044..0.049 rows=1 loops=1)"
" Index Cond: (sa.object_id = ""outer"".package_id)"
" -> Index Scan using site_nodes_node_id_pk on site_nodes sc (cost=0.00..3.03 rows=1 width=8) (actual time=0.033..0.035 rows=1 loops=1)"
" Index Cond: (sc.node_id = ""outer"".parent_id)"
" -> Index Scan using apm_packages_pack_id_pk on apm_packages p (cost=0.00..3.09 rows=1 width=19) (actual time=0.032..0.034 rows=1 loops=1)"
" Index Cond: (p.package_id = ""outer"".object_id)"
" -> Index Scan using cr_revisions_pk on cr_revisions cr (cost=0.00..3.23 rows=1 width=77) (actual time=0.034..0.037 rows=1 loops=1)"
" Index Cond: (""outer"".assessment_id = cr.revision_id)"
" SubPlan"
" -> Aggregate (cost=7.23..7.24 rows=1 width=0) (actual time=0.113..0.113 rows=1 loops=1)"
" -> Nested Loop (cost=0.00..7.23 rows=1 width=0) (actual time=0.094..0.094 rows=0 loops=1)"
" -> Seq Scan on as_sessions s1 (cost=0.00..3.99 rows=1 width=4) (actual time=0.093..0.093 rows=0 loops=1)"
" Filter: ((subject_id = 12345) AND (completed_datetime IS NOT NULL))"
" -> Index Scan using cr_revisions_pk on cr_revisions cr1 (cost=0.00..3.23 rows=1 width=4) (never executed)"
" Index Cond: (""outer"".assessment_id = cr1.revision_id)"
" Filter: (item_id = $0)"
" -> Aggregate (cost=7.23..7.24 rows=1 width=0) (actual time=0.085..0.086 rows=1 loops=1)"
" -> Nested Loop (cost=0.00..7.23 rows=1 width=0) (actual time=0.080..0.080 rows=0 loops=1)"
" -> Seq Scan on as_sessions s1 (cost=0.00..3.99 rows=1 width=4) (actual time=0.079..0.079 rows=0 loops=1)"
" Filter: ((subject_id = 12345) AND (completed_datetime IS NOT NULL))"
" -> Index Scan using cr_revisions_pk on cr_revisions cr1 (cost=0.00..3.23 rows=1 width=4) (never executed)"
" Index Cond: (""outer"".assessment_id = cr1.revision_id)"
" Filter: (item_id = $0)"
" -> Aggregate (cost=7.23..7.24 rows=1 width=0) (actual time=0.092..0.092 rows=1 loops=1)"
" -> Nested Loop (cost=0.00..7.23 rows=1 width=0) (actual time=0.080..0.080 rows=0 loops=1)"
" -> Seq Scan on as_sessions s1 (cost=0.00..3.99 rows=1 width=4) (actual time=0.080..0.080 rows=0 loops=1)"
" Filter: ((subject_id = 12345) AND (completed_datetime IS NULL))"
" -> Index Scan using cr_revisions_pk on cr_revisions cr1 (cost=0.00..3.23 rows=1 width=4) (never executed)"
" Index Cond: (""outer"".assessment_id = cr1.revision_id)"
" Filter: (item_id = $0)"
"Total runtime: 6707.835 ms"

Collapse
Posted by Tom Jackson on
Eeek! Is this one of those constructed queries? It would sure be nice if folks treated queries like function calls and didn't paste them together.

You should look into creating a number of views first. Views are compiled so you skip that cost, and you can easily test subqueries.

I can see that you are ordering by stuff which can't have an index, like lower(p.instance_name). This is never going to be fast.

But, the time something takes doesn't necessarily mean the query needs improvement, if it doesn't run very often, who cares?

Collapse
Posted by Nima Mazloumi on
This is in the assesement portlet. thus it runs on the startpage of every user who has a community that provides an assessment. My guess is that it will become worse with every assessment that is created.
Collapse
Posted by Nima Mazloumi on
Most time consuming is the priviledge check:

and exists (select 1 from acs_object_party_privilege_map ppm
where ppm.object_id = asm.assessment_id
and ppm.privilege = 'read'
and ppm.party_id = '1234')

Taking that out the query will only take 26ms. I don't see any reason why the permission (admin_p) and privilege checks need to be included in the query. Checking in tcl is by far less consuming due to caching.

Any objections going down that alley?

Collapse
Posted by Nima Mazloumi on
works like a charm now.
Collapse
Posted by Don Baccus on
I can see that you are ordering by stuff which can't have an index, like lower(p.instance_name). This is never going to be fast.
Not true at all. PG has supported functional indexes forever. It used to be an Oracle Enterprise feature but I believe that now it's even in Oracle Express.

That doesn't mean that the right index actually exists, of course, but it's certainly possible to make one. Just say "lower(column_name)" in the create index statement where you'd normally say "column_name".

Collapse
Posted by Don Baccus on
Permission caching in Tcl isn't safe because there's no comprehensive cache clearing when a permission is changed.

Also, "Checking in tcl is by far less consuming due to caching" is not, in general, a true statement. Most sites will run with permission caching turned off due to the problems caching can cause. And if a query returns (for example) 10 rows with the permission check, and 1000 rows without the permission check, doing 1000 permission checks in Tcl, one at a time, after the query, is NOT likely to be faster even IF you have caching turned on. And if you don't, doing 1000 additional queries isn't going to be fast.

Of course a lot of queries don't fit that profile, but many do, so you should be aware that your general statement about tcl checking being faster is simply wrong.

The "exists" permission check is the recommended way to do it and typically is very fast, and indeed is used throughout the toolkit. My guess is that it's blowing out the query because of the way it's being used, i.e. in a subquery.

Do what you want locally, but I don't consider moving the checking to tcl to be something we'd want committed to our CVS. Someone who uses assessment ought to work on that query, though, I'm sure it can be sped up considerably.

Collapse
Posted by Tom Jackson on
That doesn't mean that the right index actually exists, of course, but it's certainly possible to make one. Just say "lower(column_name)" in the create index statement where you'd normally say "column_name".

Yes, I made a pretty inexact statement, I assume that if someone wrote such a monster query without using views to simplify stuff, they probably also skipped creating any necessary indexes.

Of course the biggest problem is that it isn't obvious exactly what the query is intended to do, making it hard to change.

Like you said, does the query return 1000 rows without the permission check? Who knows, but I doubt that substituting Tcl code for something that can be done in the database will be faster. It could easily be orders of magnitude slower.

Is there any performance effect if you use count(*) as opposed to something like count(object_id)?

Hi!

We have modified the select open_assessments getting very better performance.
We had 15 seconds, now we get 2 seconds.

Regards,
Agustin
-------------------------------------------------


select oa.*,
acs_permission__permission_p(oa.assessment_id,:user_id,'admin') as admin_p,
acs_permission__permission_p(oa.assessment_id,:user_id,'read') as read_p
from
(select * from (select cr.item_id as assessment_id, cr.title, cr.description, a.password,a.type,
to_char(a.start_time, 'YYYY-MM-DD HH24:MI:SS') as start_time,
to_char(a.end_time, 'YYYY-MM-DD HH24:MI:SS') as end_time,
to_char(now(), 'YYYY-MM-DD HH24:MI:SS') as cur_time,
cf.package_id, p.instance_name as community_name,
sc.node_id as comm_node_id, sa.node_id as as_node_id, a.anonymous_p,
(select count(*) from as_sessions s1,
cr_revisions cr1 where
s1.assessment_id=cr1.revision_id
and cr1.item_id=cr.item_id
and s1.subject_id=:user_id
and completed_datetime is null) as in_progress_p,
(select count(*) from as_sessions s1,
cr_revisions cr1 where
s1.assessment_id=cr1.revision_id
and cr1.item_id=cr.item_id
and s1.subject_id=:user_id
and completed_datetime is not null) as completed_p,
a.number_tries

from as_assessments a, cr_revisions cr, cr_items ci, cr_folders cf,
site_nodes sa, site_nodes sc, apm_packages p
where a.assessment_id = cr.revision_id
and cr.revision_id = ci.latest_revision
and ci.parent_id = cf.folder_id
and ci.publish_status = 'live'
and cf.package_id in ([join $list_of_package_ids ", "])
and sa.object_id = cf.package_id
and sc.node_id = sa.parent_id
and p.package_id = sc.object_id
and a.assessment_id in (select distinct asm.assessment_id from as_assessment_section_map asm, as_item_section_map ism
where ism.section_id = asm.section_id)
and ((a.start_time < current_timestamp and a.end_time > current_timestamp) or a.start_time is null or a.end_time is null)
order by lower(p.instance_name), lower(cr.title)) q
where ((q.completed_p < q.number_tries) or (q.number_tries=0 or q.number_tries is null))) AS oa
where acs_permission__permission_p(oa.assessment_id,:user_id,'read') = 't'