Forum OpenACS Q&A: Slow SubSelect in BBoard Forum
Sulu: Warp 7 sir?
Kirk: It's too damn slow!...
-- Lyrics From: Star Trek Rhapsody by Weird Al
Here is the specific code that I use with my mods to the BBoard system in openacs3. What this does, is eliminate the "New Answers" page. This page is unreliable when using more than one machine to access the forums probably due to cookie changes. Also, the page often hides threads that I just didn't get around to read when I come back later.
My solution, implemented long ago, was to have just one page which shows the threads with the most recent listed first along with the name of the most recent poster.
Here is the code that I use. I have simplified it and removed the view so it is clear. The problem is that it can take 5 to 10 SECONDS to execute on our current bboard with about 18,000 messages.
The really slow part is the subselect: msg_id in (select... shown in red below. The subselect is used to get the most recent message in each thread. Every message in a thread has the same root_message_id and as well as the leading part of the sort_key (ex: 0004FF:01)
select now() from dual; SELECT bb.msg_id, bb.root_msg_id, bb.refers_to, bb.topic_id, bb.user_id AS poster_id, bb.one_line, bb.message, bb.html_p, bb.posting_time, bb.sort_key, bt.topic, u.first_names || ' ' || u.last_name AS name, u.email FROM bboard bb, bboard_topics bt, users u WHERE bb.user_id = u.user_id AND bb.topic_id = bt.topic_id AND bb.dele_ted ISNULL AND bt.topic_id = 11 and date_num_days(sysdate() - bb.posting_time) < 90 and bb.msg_id in (select max(bb.msg_id) from bboard group by bb.root_msg_id) order by bb.posting_time desc limit 1; select now() from dual;
I even added a 2 column index:
CREATE INDEX bboard_root_msg_msg_idx
ON bboard (root_msg_id, msg_id);
But this seems to have little effect.
I've racked my brains on how to get the most recent row from variable sized group AS FAST AS POSSIBLE.
As I understand it, it is best to do the processing in the database before getting the results. I had considered getting the whole thread out of the database and in tcl throw away the unneeded messages.
Another idea is to build a helper table that inserts or overwrites the most recent message in a forum "topic" every time a post is made. Then, when I want to show the who and title of the most recent message in a thread, I just join bboard,users, bboard_topics with the new helper table.
Create table bboard_latest ( root_msg_id char(6) not null, msg_id char(6) not null ); DROP FUNCTION get_last_bbthread(char,char); CREATE FUNCTION get_last_bbthread(char(6), char(6)) RETURNS boolean AS 'DECLARE vroot_msg_id ALIAS FOR $1; vmsg_id ALIAS FOR $2; BEGIN IF length(vroot_msg_id) = 0 -- no root id, failure THEN RETURN ''f''; END IF; IF length(vmsg_id) = 0 -- no msg id, failure THEN RETURN ''f''; END IF; SELECT into temp x from bboard_latest WHERE root_msg_id = vroot_msg_id; IF NOT FOUND -- is root not in table? THEN INSERT INTO bboard_latest (root_msg_id,msg_id) VALUES (vroot_msg_id, vmsg_id); ELSE UPDATE bboard_lastest SET msg_id = vmsg_id WHERE root_msg_id = vroot_msg_id; END IF; RETURN ''t''; END; ' LANGUAGE 'plpgsql'; -- Try it select get_last_bbthread('000400','000401'); ERROR: parser: parse error at or near "x"
I have tried variations to get it right and I don't understand the necessity of "select into" vs just a select. I am further confused by the PG docs which say:
- Note: CREATE TABLE AS is functionally equivalent to SELECT INTO. CREATE TABLE AS is the recommended syntax, since SELECT INTO is not standard. In fact, this form of SELECT INTO is not available in PL/pgSQL or ecpg, because they interpret the INTO clause differently.
Ok, I would enjoy a bit of help and explaination.
Ok, Thank $your_deity for Google!
I found the solution with a couple of Google serches.
I had not defined "record".
So it is defined as rec and now works! Here's the good code:
DROP FUNCTION get_last_bbthread(char,char); CREATE FUNCTION get_last_bbthread(char(6), char(6)) RETURNS boolean AS 'DECLARE vroot_msg_id ALIAS FOR $1; vmsg_id ALIAS FOR $2; rec record; BEGIN IF length(vroot_msg_id) = 0 -- no root id, failure THEN RETURN ''f''; END IF; IF length(vmsg_id) = 0 -- no msg id, failure THEN RETURN ''f''; END IF; SELECT into rec * from bboard_latest WHERE root_msg_id = vroot_msg_id; IF NOT FOUND -- is root not in table? THEN INSERT INTO bboard_latest (root_msg_id,msg_id) VALUES (vroot_msg_id, vmsg_id); ELSE UPDATE bboard_latest SET msg_id = vmsg_id WHERE root_msg_id = vroot_msg_id; END IF; RETURN ''t''; END; ' LANGUAGE 'plpgsql';
In PL/pgSQL SELECT INTO means the same as it does in Oracle's PL/SQL (which PL/pgSQL is modelled after). The idea is that you define some variables, say "v_foo" and "v_bar" and then select into those variables like this:
Afterwards, v_foo and v_bar are available for use in your code. Grep around the OpenACS 4.5 sources for examples of this - there are tons of them.select t.foo, t.bar into v_foo, v_bar from table t where some_key = 'some_value';
OK, my above code works:
select get_last_bbthread('000400','000401'); get_last_bbthread ------------------- t (1 row)
Now, I'm onto creating a trigger but it seems not to see my function:
CREATE TRIGGER bb_latest BEFORE INSERT ON bbtest FOR EACH ROW EXECUTE PROCEDURE get_last_bbthread('root_msg_id','msg_id'); ERROR: CreateTrigger: function get_last_bbthread() does not exist
Please help me find what is missing or incorrect. TIA.
Note: bbtest is the same as bboard and will eventually be changed in production.
create function my_fun () returns opaque as ' declare -- vars here; begin -- code here; end;' language 'plpgsql'; create trigger my_trigger after insert on my_table for each row execute procedure my_fun ();
So, slowly I'm getting this.... Based on this, it looks like I need to revise my function. What I need to understand is how column values are passed. In my working function above, They were passed to the function.
What I want this trigger to do, is to get two column values from bboard after any insert and then insert or update these two values into my helper table.
DROP FUNCTION latest_thread(); CREATE FUNCTION latest_thread() RETURNS OPAQUE AS 'DECLARE rec record; BEGIN SELECT into rec * from bboard_latest WHERE root_msg_id = new.root_msg_id; IF NOT FOUND -- is root not in table? THEN INSERT INTO bboard_latest (root_msg_id,msg_id) VALUES (new.root_msg_id, new.msg_id); ELSE UPDATE bboard_latest SET msg_id = new.msg_id WHERE root_msg_id = new.root_msg_id; END IF; RETURN NEW; END; ' LANGUAGE 'plpgsql'; DROP TRIGGER latest_thread on bbtest; CREATE TRIGGER latest_thread AFTER INSERT ON bbtest FOR EACH ROW EXECUTE PROCEDURE latest_thread (); -- now test it select * from bboard_latest; select * from bbtest; insert into bbtest (root_msg_id,msg_id) values ('123456','123456');