Forum OpenACS Q&A: Creating aggregate functions in Postgres

Hi all,

I have this view that I am porting that is a UNION of two self OUTER JOINed views. PG does not suppot UNIONs in views, so we thought of creating our own aggregate functions to do the job.

There are no examples for CREATE AGGREGATE in the docs, but we made it work, at least partially, but it explodes a little after the first row column.

Here's the original Oracle query:

create or replace view cs_n_sessions_day_user as
select b.date_id, b.n_sessions_day, nvl(a.n_users, 0) as members,
                  b.n_users as non_members
from cs_historical_visits_grouped a,cs_historical_visits_grouped b
where  b.n_sessions_day = a.n_sessions_day(+)
                  and b.date_id = a.date_id(+)
                  and 1 = a.member_p(+)
                  and b.member_p = 0
UNION
select a.date_id, a.n_sessions_day, a.n_users as members, 
                  nvl(b.n_users, 0) as non_members
from cs_historical_visits_grouped a, cs_historical_visits_grouped b
where  a.n_sessions_day = b.n_sessions_day(+)
                  and a.date_id = b.date_id(+)
                  and a.member_p = 1
                  and 0 = b.member_p(+);
Here is our port of the cs_historical_visits_grouped view:
create view cs_historical_visits_grouped as 
select date_id, n_sessions_day, not_null_integer_p(user_id) as
member_p, 
       count(browser_id) as n_users 
from cs_historical_visits 
group by date_id, n_sessions_day;
not_null_integer_p is a function we created that basically does case when user_id is null then 1 else 0 end, but for some reason PG wouldn't take it in a view.

Here's a helper view for our aggregate function:

create view cs_hist_visits_grouped_txt as 
select date_id, n_sessions_day, member_p, n_users,
       member_p::varchar || ' ' || n_users::varchar as
member_p_n_users
from cs_historical_visits_grouped;
And here are the aggregate functions:
create function agg_if_member (integer, varchar)
returns integer as '
declare
	a_int alias for $1;
	a_varchar alias for $2;
	member_p integer;
	num_members integer;
begin
	member_p := substr(a_varchar, 1, 1)::integer;
	num_members := substr(a_varchar, 3)::integer;
	if member_p = 1 then
		return num_members;
	else
		return a_int;
	end if;
end;
' language 'plpgsql';

create function agg_if_not_member (integer, varchar)
returns integer as '
declare
	a_int alias for $1;
	a_varchar alias for $2;
	return_val integer;
	member_p integer;
	num_members integer;
begin
	member_p := substr(a_varchar, 1, 1)::integer;
	num_members := substr(a_varchar, 3)::integer;
	return_val := 0;
	if a_int is not null then
	   return_val := a_int;
	end if;
	if member_p = 0 and num_members > a_int then
	   return_val := num_members;
	end if;
	return return_val;
end;
' language 'plpgsql';

create aggregate num_members_or_zero (
       basetype = varchar,
       stype1 = integer,
       sfunc1 = agg_if_member,
       initcond1 = 0);

create aggregate num_non_members_or_zero (
       basetype = varchar,
       stype1 = integer,
       sfunc1 = agg_if_not_member,
       initcond1 = 0);
and our query was:
select date_id, n_sessions_day,
num_members_or_zero(member_p_n_users) as members,
num_non_members_or_zero(member_p_n_users) as non_members 
from foobar 
group by date_id, n_sessions_day;
Our fake table foobar had the same structure as our cs_hist_grouped_txt view has, and this was the dummy data we had inserted to play with:

date_id | n_sessions_day | member_p | n_users | member_p_n_users
---------+----------------+----------+---------+------------------
       1 |            500 |        0 |      50 | 0 50
       1 |            500 |        1 |      30 | 1 30
       2 |           2000 |        1 |     210 | 1 210
       2 |           2000 |        0 |     999 | 0 999   
and our output was:

 date_id | n_sessions_day |  members  | non_members
---------+----------------+-----------+-------------
       1 |            500 |        30 |   136394044
       2 |           2000 | 136394612 |   136394612 
the values for members is what we expected for date_id == 1, but the other ones are all goofy. Anybody has any ideas or worked with CREATE AGGREGATE before? Anybody ported something like this before?

Sorry about the long email and TIA.

Collapse
Posted by Dan Wickstrom on
" not_null_integer_p is a function we created that basically does case when user_id is null then 1 else 0 end, but for some reason PG wouldn't take it in a view. "

I haven't worked with aggregates, but I do know that pg's handling of null in pl/pgsql functions is broken. I believe that this should be fixed in pg 7.1, but you should probably check with the pg developers. The problem is that any time a null is detected in the input arguements, the result is set to null, and the function returns. As a result, it wouldn't work to encapsulate a "case when" statement in a function.