Forum OpenACS Development: Overview of PostgreSQL's limitations

Posted by Sebastian Skracic on 10/20/00 06:30 PM

I would appreciate if someone could come up with list of PostgreSQL's limitations that could interfere with ACS 4.0 porting effort, namely:

max. row width
max. key/index width
max. size of varchar
etc.

Finally, as a PostgreSQL newcomer (well I had some contacts with Postgres in previous life, but at that time Postgres had still been ridden with features like time-travel :-)) I'm very interested in PostgreSQL's procedural language(s) - how much can be accomplished with them, how do they do transactions, do they accept variable number of arguments etc. As you know, ACS 4 (Classic) relies heavily on Oracle PL/SQL wrappers to simplify programmer's interface to ACS objects, so having this things correctly ported to PostgreSQL would help a lot.

2: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Gilbert Wong on 10/20/00 08:17 PM

One of my major gripes with postgreSQL is that it doesn't seem to have a package feature like Oracle. It's a minor inconvenience for me because I want to keep the TCL scripts for my ACS 4.0 and OpenACS 4.0 modules as similar as possible. I'll leave the other questions to people more qualified to answer them.

3: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Dan Wickstrom on 10/20/00 09:01 PM

postgresql's pl/sql functions are pretty good at emulating most of the oracle functionality, but they do have a couple of serious limitations:

1. max number of args is 16. This used to be only 8 for pg versions before 7.0. A lot of the pl/sql functions in acs 4.0 have a large number of input arguements, but so far the largest number of input args that I have encountered is 16.

2. if any of the args are null, the function short-circuits and returns a null.

3. an pl/sql function is an implicit transaction. no transaction control is allowed inside of a pl/sql function. Either the whole function completes correctly, and is committed, or the whole thing is rolled back.

4. pg doesn't support packages, so we'll need a naming convention to avoid namespace conflicts. Also pg truncates names greater than a certain length, so it might be difficult to do something simple like substitute a '_' for a '.' when porting the package specific function calls.

4: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Roberto Mello on 10/20/00 09:41 PM

Complementing Dan's response:

Biggest function name is 30 characters long.

5. PG has no procedures. It only has functions that can only return one thing (AFAIK - hopefully I am wrong).

6. Postgres supports: PL/PgSQL (imitates PL/SQL but has many less features), PL/Tcl (nice to work with strings and arrays) and PL/Perl (never did anything with it).

Maybe you could tell us the same things about Interbase (in another thread). I am sure there are people other than me that are also curious.

5: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Alex Sokoloff on 10/22/00 10:04 PM

Let's not forget also that postgres doesn't have statement-level triggers _or_ packages, which limits what you can do with functions in other ways. For example, in Oracle you can populate an array of values in a "table" defined in a package header while you do a series of updates, and then process the table in a second procedure that is executed after the updates are done - which is a "standard hack for triggers which require selects on the mutating table", according to whoever wrote the bookmarks module.

6: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Dan Wickstrom on 10/22/00 10:19 PM

True, but postgresql doesn't have the mutating tables problem that oracle has, so this hack is not required. Check out the ecommerce module to see how postgresql is better than oracle in this case :).

7: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Don Baccus on 10/25/00 06:35 PM

A basic issue to answer (reminiscent of the past!) is whether or notto support PG 7.0, or bite the bullet and support PG 7.1 only for ACS4.0. PG 7.1 goes beta in a couple of weeks...I raise this because 7.1 includes a new function manager thatcorrectly handles NULLs, which would really help us out a lot (havingstruggled with this in the past).Name length is a compile-time configurable number, though we probablywant to stick with the default 32 characters.Max size of varchar is roughly the same as blocksize, which werecommend be set to 16KB. We've been using a compressed text type(which I believe Interbase does, too) so in practice you getconsiderably longer varchars. 7.1 will include a transparentlarge-object feature that will allow varchars of any length, andlargely remove the row length restrictions which have plagued PG inthe past (at some expense in performance, but certainly no worse andprobably better than the explicit CLOBs used in Oracle).Indices are limited to a bit over 2KB, which shouldn't be a problemunless aD now indexes keys over the old 700 or so byte limit Oracleallows with a default block size.

8: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Dan Wickstrom on 10/25/00 06:59 PM

And my favorite - pg 7.1 will have outer joins. From my experience, porting those outer joins consumes about 90% of my porting effort. For that reason alone, I would vote to use pg 7.1 beta for the porting effort.

9: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Don Baccus on 10/26/00 12:43 AM

I *think* outer joins will be there, but am uncertain. Also, they maynot support the Oracle syntax, though I've got some parser changesthat should fix that and perhaps they could be included...I shouldprobably download a 7.1 pre-beta and have a go at it.One minor addendum - we recommend folks use a 16KB blocksize (i.e. maxrow width of 16KB). However, the true PG max limit is 32KB, thoughfor OpenACS we've not seen any real need to compile with a 32KBblocksize as of yet.

10: What to do about CONNECT BY? (response to 1)

Posted by Sebastian Skracic on 10/31/00 06:10 PM

Thanks for all answers so far. I'm just sweating over CONNECT
BY constructs. Somewhere in openACS code I spotted a solution
which uses two (or three?) Pl/PgSql functions emulating LEVEL
and ROWNUM, or something like that.
<p>
Is this preferred way? Or should we create index table for
each parent/child structure which should be updated on
INSERT/DELETE/UPDATE to master table? Since PostgreSQL does
not suffer from "mutating" syndrome like Oracle, will it be
possible just to write few triggers that could fire
recursively and maintain the index (parent-child mapping)
table? Or something else?

11: Response to Overview of PostgreSQL's limitations (response to 1)

Posted by Dan Wickstrom on 10/31/00 06:55 PM

    Sebastian> Thanks for all answers so far.  I'm just sweating over
    Sebastian> CONNECT BY constructs.  Somewhere in openACS code I
    Sebastian> spotted a solution which uses two (or three?) Pl/PgSql
    Sebastian> functions emulating LEVEL and ROWNUM, or something like
    Sebastian> that.  Is this preferred way?

We've done some connect by and oracle pseudo-column emulation with recursive functions for pg. See the intranet, homepage and file-storage modules for examples. This is not optimal, but it could be improved by caching the order by field in a table column. Other alternatives have also been suggested. See this previous discussion

13: Re: Response to Overview of PostgreSQL's limitations (response to 11)

Posted by Beverly Connor on 08/07/03 02:31 AM

Sorry, ignore this.

12: Response to "What to do about CONNECT BY?" (response to 1)

Posted by Mark Dettinger on 11/01/00 08:45 PM

I wrote a new Tcl procedure db_dfs (dfs = depth-first search) that
simulates Oracle's "connect by" query. It works similar to db_foreach.

See http://dev.arsdigita.com/connect_by.tcl

Please let me know what you think. If this is a move in the right
direction, we could replace all calls of db_foreach that use a
"connect by" clause with the corresponding call of "db_dfs".
Then, we could provide two versions of db_dfs: one for Postgres
that simulates the "connect by" with a recursive algorithm and
one for Oracle that uses "connect by" as before.