Forum OpenACS Development: Porting ACS 4 PL/Sql packages

Collapse
Posted by Sebastian Skracic on

I'm seeking a method to port Oracle's pl/sql wrappers which manipulate ACS 4.0 objects.

Here's overview of a solution that sort of works:

  • PL/Tcl was used
  • I decided to 'pack' all input arguments into single TEXT string, and let Tcl unwind it with 'array set'
  • the idea is to have one PL/Tcl function per package which accepts just two arguments: method ('new', 'delete', ...) and packed input parameters as second arg (as stated above). This way we should avoid namespace pollution. Oh, and the return value is of type TEXT, into which id's returned from constructors can be safely coerced, while we can simply ignore return values from destructors.
Here's how it can be invoked:
SELECT pltcl_acs_attribute('create', '

   object_type     "''apm_package''"
   attribute_name  "''package_blah''"
   datatype        "''string''"
   pretty_name     "''Package Blah''"
   pretty_plural   "''Package Blahs''"

');
... resulting in:
NOTICE:  INSERT INTO acs_attributes
      (attribute_id, min_n_values, pretty_name, datatype,
attribute_name,
object_type, storage, max_n_values, sort_order, static_p,
pretty_plural)
    VALUES
      (17, 1, 'Package Blah', 'string', 'package_blah', 'apm_package',
'type_specific', 1, 1, 'f', 'Package Blahs')
 pltcl_acs_attribute
---------------------
 17
(1 row)
Here's the actual code: (here is the link to relevant part of classic ACS 4.0 data model)
CREATE FUNCTION pltcl_acs_attribute(text, text)
RETURNS text
AS '


switch -- $1 {

  create {

    array set pl_args $2
    if {![info exists pl_args(min_n_values)]} {
      set pl_args(min_n_values) 1
    }

    if {![info exists pl_args(max_n_values)]} {
      set pl_args(max_n_values) 1
    }

    if {![info exists pl_args(storage)]} {
      set pl_args(storage) ''type_specific''
    }

    if {![info exists pl_args(static_p)]} {
      set pl_args(static_p) ''f''
    }

    if {![info exists pl_args(sort_order)]} {
      spi_exec "
          SELECT COALESCE(MAX(sort_order), 1) AS v_sort_order
          FROM acs_attributes
          WHERE object_type = $pl_args(object_type)
          and attribute_name = $pl_args(attribute_name)"
      set pl_args(sort_order) $v_sort_order
    }

    spi_exec "SELECT NEXTVAL(''acs_attribute_id_seq'') AS
v_attribute_id"
    set pl_args(attribute_id) $v_attribute_id

    set target_list [list]
    set values_list [list]

    foreach key [array names pl_args] {
      lappend target_list $key
      lappend values_list $pl_args($key)
    }

    set sql "INSERT INTO acs_attributes
      ([join $target_list ", "])
    VALUES
      ([join $values_list ", "])"

    elog NOTICE $sql
    
    spi_exec $sql

    return $v_attribute_id

  }

  default {

    #  maybe throw exception here?
    return ""

  }

}

'
LANGUAGE 'pltcl';

Problems:
  • Constructing 'packed' argument is major pain in the ass. Just look at the number of single and double-quotes.
  • Again, the way the input arguments are 'packed' is huge security hole (SQL-wise). [quote ...] should be used carefully, because we might still want to pass NULL here and there.
  • Tcl code should be smarter about required arguments.
  • Tcl code should be smarter about defaults.
Collapse
Posted by Ben Adida on
I don't think this direction is very clean. Since we're going to have
to change the SQL calls anyways, I'm not sure we need to imitate
the package behavior of Oracle PL/SQL. We can just name
PL/SQL procedures in a standard way so as to imitate
packages, without this gnarly amount of Tcl. What do you think?
Collapse
Posted by Roberto Mello on
I was thinking about this very topic this week and I tend to agree with Ben.

Packages for ACS classic will have to be modified to work with OpenACS anyways, so we might as well not complicate our lives even more. It won't look at nifty as Oracle's but it will be easier to maintain.

What I haven't seen yet is a clean to have default values for the parameters passed to the function. We also can't make them NULL because the whole function short circuits and exits in those cases.

Collapse
Posted by Dan Wickstrom on
acs 4.0 function names seem to be on average quite long and descriptive. Since postresql truncates function names to 44 chars, we might need to rename some of the packages to something shorter, if we use the convention of converting package call functions from packageName.function to something like packageName_function.  In addtion, the package constructor methods seem to take a lot of arguements.  After a brief look around, I was able to find a package constructor that required 16 arguements.  16 is the limit for postgresql, so we might run into situations where arguement packing is required.
Collapse
Posted by Roberto Mello on
44 characters? Where? On my PG 7.0.2 it only lets me do 31 chars:
lbn=# create function a234567890123456789012345678901234567890() 
lbn-# returns varchar as '
lbn-# begin
lbn'# 	return ''blah'';
lbn'# end
lbn'# ' language 'plpgsql';
NOTICE:  identifier "a234567890123456789012345678901234567890" will be
 truncated to "a234567890123456789012345678901"
CREATE
Collapse
Posted by Dan Wickstrom on
You are right.  I must have been thinking of something else when I made the post.  In any event, postgresql will truncate identifiers.
Collapse
Posted by Steve Crossan on
Sorry if this is going over old ground, but oughtn't we to be thinking about some kind of separation between the database layer and the application code? I was disappointed to see this hadn't been done in ACS4.0 - Oracle specific code appearing in .tcl pages, rather than hidden behind a function layer - which would have made the ACS port easier too. But Open/ACS doesn't necessarily have to go down the same route - especially if we're thinking beyond Postgres to other databases.

Ben mentioned nsjava in an earlier post - could that be a way forward? Either create a Java db access layer based on JDBC (therefore portable?) and call that from the pages using nsjava - or hide this behind a tcl function layer too so that java calls don't have to appear in the .tcl pages. In this scenario creation and deletion of database objects could map to creation and deletion of Java objects which manage their db storage separately and hide this from the page writers.

Collapse
Posted by Roberto Mello on
I don't know if I understand your point Steve... AOLserver aolready has a database abstraction layer.

Even if we used JDBC, it would only be completely portable if none of the Oracle-specific features were used. But then we'd better go with just the AOLserver API.

Collapse
Posted by Steve Crossan on
It's true, AOLServer does have a kind of abstraction layer. But through calls like ns_db dml you can still write code that's specific to the database you're using in the tcl pages. My suggestion is that somehow this could be prevented. Staying within tcl, all database calls would be in functions rather than in pages, simplifying the process of porting. In fact when you do this sort of thing you end up implementing something like ODBC/JDBC functionality in Tcl. But at least you have one place to concentrate on to port to another database, rather than code scattered across script files everywhere.

The Java idea was only so that the scripting and webserver layers would then be (to some extent) language neutral. Most scripting languages can call Java these days. Plus you get an object model (which ACS 4.0 seems to want) without necessarily having to rely on the database to provide it.

Collapse
Posted by Steve Crossan on
Follow up to earlier posting - copy of something posted to ACS development:

On sql separation - I agree it does benefit the programmer to have the full freedom of sql in tcl pages (or whatever scripting language). But it would be nice (and maybe achievable) if this were restricted as much as possible to standard SQL. Instead of writing

db_multirow surveys survey_select {
              select survey_id, name
              from survsimp_surveys, acs_objects
              where object_id = survey_id
              and acs_permission.permission_p(object_id, $user_id,'survsimp_take_survey') = 't'
              and enabled_p = 't'
          }
In your tcl pages you'd write something like
db_permitted_multirow $user_id surveys survey_select {
             select survey_id, name
              from survsimp_surveys, acs_objects
              where object_id = survey_id
             and enabled_p = 't'
     }
The function db_permitted_multirow (or whatever) could still execute the functionality by tagging the call to acs_permission.permission_p onto the select clauses of the statement. But only this function would have to be ported, not every page that uses the permissions module. Which is every page. Cost: one more frame on the call stack. Benefit: much easier to port.

Incidentally, please consider all these postings as an offer of resources to the OpenACS port :)

Collapse
Posted by Roberto Mello on
I still don't think I get what you are trying to say... abstraction layers only are 100% portable if the SQL being sent to the databases you want to work with are supported by all of them. If you are using JDBC, ODBC, ns_db or whatever, and your SQL is not 100% standard and supported by all, then you'll still have to modify your programs.

From one side the new ArsDigita DB API will make porting easier... for example, now we don't have to go around modifying every single place where "SELECT sysdate FROM DUAL" and "SELECT a_seq.nextval FROM DUAL" appear. We just modify the declaration of the db_sysdate and db_nextval and we're set.

But the new DB API also brought new challenges like bind variables and a greater dependence on ns_ora.

On the db_multirow x db_permitted_multirow issue, I can't find db_permitted multirow in the DB API.

Collapse
Posted by Steve Crossan on
Roberto -

db_permitted_multirow doesn't currently exist, but I was suggesting it should. In general, Oracle specific sql should be hidden behind functions like these that take 'standard' SQL (in this case the select without the pl/sql call where clause). The aD versions of these functions still call the Oracle specific stuff - in this case they just tag the extra where clause on the end and send it off. But OpenACS can do its own thing. This is kind of like what you say about the utility of db_sysdate etc. - that sort of thing could be enforced across the whole codebase. calls to ns_ora and use of bind variables could be hidden behind this layer too.

Collapse
Posted by Don Baccus on
I'm not terribly worried about calls to ns_ora.  We can implement aTcl layer to map them to appropriate PG ns_db/ns_pg calls, or if wewish to support a modified PG driver can simply add ns_ora to it,appropriately mapping the "select" etc variants.
Collapse
14: Attempt #2 (response to 1)
Posted by Sebastian Skracic on
Here's another proposal for porting ACS 4.0 PL/SQL packages to PG:

For each function/procedure contained within a PL/SQL package we write pltcl or plpgsql function named as <package_name>__<function_name>. Sometimes function_name can be abbreviated when it's obvious from package_name, e.g. acs_object_type__create instead of acs_object_type__create_type.

However, we must take care of the following:

  • not to pass NULL as one of the arguments, because function will simply return NULL
  • whenever dynamic sql is required, we are forced to use pltcl (is this correct?)
  • whenever we want to _return_ NULL, we must resort to plpgsql (correct?)
So, here are the simple ones:
create function acs_attribute__add_description(text, text, text, text) returns text as '
  declare
    v_object_type       alias for $1;
    v_attribute_name    alias for $2;
    v_description_key   alias for $3;
    v_description       alias for $4;
  begin
    insert into acs_attribute_descriptions
     (object_type, attribute_name, description_key, description)
    values
     (v_object_type, v_attribute_name,
      v_description_key, v_description);
  end;
  return v_description;
' language 'plpgsql';



create function acs_attribute__drop_description(text, text, text) returns text as '
  declare
    v_object_type       alias for $1;
    v_attribute_name    alias for $2;
    v_description_key   alias for $3;
  begin
    delete from acs_attribute_descriptions
      where object_type = v_object_type
      and attribute_name = v_attribute_name
      and description_key = v_description_key;
  end;
  return v_description_key;
' language 'plpgsql';

Now, here's the complicated one. Once again, I'll try to find a workaround to Oracle's default value handling. Needless to say, it's ugly, again. I've created a simple pltcl function:
create function arg_pack(text, text) returns text as '
  return "[list $1 $2] "
' language 'pltcl' ;

So that Oracle's PL/SQL is tranformed in following way:

  Oracle:

   attr_id := acs_attribute.create_attribute(
        object_type => 'acs_object',
     attribute_name => 'object_type',
           datatype => 'string',
        pretty_name => 'Object Type',
      pretty_plural => 'Object Types'
   );

  Postgres:

   select acs_attribute__create(
     arg_pack('object_type', 'acs_object') ||
     arg_pack('attribute_name', 'object_type') ||
     arg_pack('datatype', 'string') ||
     arg_pack('pretty_name', 'Object Type') ||
     arg_pack('pretty_plural', 'Object Types')
   );

arg_pack is used to construct key/value pairs into array-like lists which could be then parsed from within plpgsql function, using arg_unpack:
create function arg_unpack(text, text, text) returns text as '
  declare
    packed_args           alias for $1;
    argument_name         alias for $2;
    default_value         alias for $3;
  begin
    if ... NULL indicator is set ... then
      return NULL;
    else
      if ... the argument is supplied ...  then
        return argument_value;
      else
        return default_value;
      end if;
    end if;
  end;
' language 'plpgsql' ;
arg_unpack accepts 3 arguments: the first one is packed argument list ready to be fed into array set, the second is the argument name (key) and the third is default value which is returned if key does not appear in packed argument list. The function is written in plpgsql so we can return genuine NULL from it.

Yes, but how can we explicitly set one named argument to NULL? Since we cannot pass NULL value to arg_pack (because complete packed argument list would be NULL-ified), we denote NULL argument by setting its "NULL indicator", i.e. another argument named <argument_name>.null to arbitrary value:

   select acs_object_type__create(
     arg_pack('object_type', 'acs_object') ||
     arg_pack('supertype.null', 'yes') ||
     arg_pack('pretty_name' , 'Object') ||
     arg_pack('pretty_plural' , 'Objects') ||
     arg_pack('table_name' , 'acs_objects') ||
     arg_pack('id_column' , 'object_id') ||
     arg_pack('package_name' , 'acs_object') ||
     arg_pack('name_method' , 'acs_object__default_name')
   );
Let's see how acs_object_type__create handles this:
create function acs_object_type__create(text) returns text as '

declare
    packed_args         alias for $1;
    v_object_type       acs_object_types.object_type%TYPE;
    v_pretty_name       acs_object_types.pretty_name%TYPE;
    v_pretty_plural     acs_object_types.pretty_plural%TYPE;
    v_supertype         acs_object_types.supertype%TYPE;
    v_table_name        acs_object_types.table_name%TYPE;
    v_id_column         acs_object_types.id_column%TYPE;
    v_package_name      acs_object_types.package_name%TYPE;
    v_abstract_p        acs_object_types.abstract_p%TYPE;
    v_type_extension_table acs_object_types.type_extension_table%TYPE;
    v_name_method       acs_object_types.name_method%TYPE;
begin
  select
    arg_unpack(packed_args, ''object_type''),
    arg_unpack(packed_args, ''pretty_name''),
    arg_unpack(packed_args, ''pretty_plural''),
    arg_unpack(packed_args, ''supertype'', ''acs_object''),
    arg_unpack(packed_args, ''table_name''),
    arg_unpack(packed_args, ''id_column''),
    arg_unpack(packed_args, ''package_name''),
    arg_unpack(packed_args, ''abstract_p'', ''f''),
    arg_unpack(packed_args, ''type_extension_table''),
    arg_unpack(packed_args, ''name_method'')
  into
    v_object_type,
    v_pretty_name,
    v_pretty_plural,
    v_supertype,
    v_table_name,
    v_id_column,
    v_package_name,
    v_abstract_p,
    v_type_extension_table,
    v_name_method;

    if v_package_name is null then
      v_package_name := v_object_type;
    end if;

    insert into acs_object_types
      (object_type, pretty_name, pretty_plural, supertype, table_name,
       id_column, abstract_p, type_extension_table, package_name,
       name_method)
    values
      (v_object_type, v_pretty_name, v_pretty_plural, v_supertype, v_table_name,
       v_id_column, v_abstract_p, v_type_extension_table, v_package_name,
       v_name_method);

    return v_object_type;
end;
' language 'plpgsql';
Collapse
Posted by Don Baccus on
How extensively does the ACS 4.0 PL/SQL code make use of default values?

I'm tempted to just get rid of them, and give the Oracle source back
to aD afterwards :)

Your hack feels like a heck of a lot more work for us compared to the
relatively minor inconvenience of foregoing the use of default values.

Besides which, default values make the code a bit more opaque for
someone who's not intimately familiar with the toolkit...though I have
to admit I use them all the time, myself (in languages that support
them, not PL/pgSQL obviously).

Collapse
16: Keeping things simple? (response to 1)
Posted by Ben Adida on
These are some very creative solutions here. However, I'm with Don on this one: let's keep things simple. A naming convention for the packages is definitely a good idea. Two underscores maybe? Sebastian seems to have suggested this, and I think it's a good idea.

With respect to default values: I think the excessive use of PL/Tcl could be a bad idea from a performance standpoint. I'm thinking, like Don (maybe I should just let him speak :), that we can deal without default values. I will look over the source code some more to double check this.

Most importantly, I think we should try a simple implementation first. If we find that this might be incomplete and a real pain in the butt, we should by all means go with something more complete However, my current understanding of the source code is that this won't be too bad.

Opinions? Dissents?

Collapse
Posted by Sebastian Skracic on
Now when I'm somewhat more familiar with core ACS data model, I agree that handling of default values, especially in combination with Pl/Tcl is unnecessarily complicated thing.

So I would suggest that all arguments should be retained in Postgres functions. We'll just have to make sure that all arguments are specified while we are porting some Oracle function/procedure invocation, with maybe two exceptions:

  • How to pass NULL value to the function?
  • I think it's worth the effort to overload function definitions for some functions/procedures that accepts 2-3 arguments, with 1-2 of them being optional, for example group_contains_p in groups-create.sql and site_node.node_id in site-nodes-create.sql.
Collapse
Posted by Dan Wickstrom on
Since the function manager for pg 7.1 has been rewritten so that it correctly handles nulls, we shouldn't have to do anything special for nulls.  Also, I think that aD consistently used db_null throughout there code, so we shouldn't have to worry about the converting empty strings to null values when porting.