Forum OpenACS Development: Error whether using POSIX-style or SIMILAR TO for RE's

Hi there, I'm trying to run the following query, on OACS layer, whether using POSIX-style RE or SIMILAR TO RE, to match string patterns, however it sill returns the error below.

1) POSIX-style RE

db_0or1row select_revision "SELECT cr.title,  
    MAX(o.creation_date) AS creation_date, COUNT(*) AS occurency
    FROM cr_items ci, cr_revisionsx cr, acs_objects o 
    WHERE ci.item_id = cr.item_id 
    AND ci.item_id = o.object_id AND ci.content_type = 'qt_vehicle' 
    AND cr.title !~ '^[0-9]'  
    AND cr.title NOT IN ('UNKNOWN', 'FBF724','FBF124') 
    GROUP BY cr.title HAVING COUNT(*) > 1 
    ORDER BY COUNT(*) DESC  ;" -column_array revisions                                                                                                                                                                                        

2) or SIMILAR TO RE

db_0or1row select_revision "SELECT cr.title,  
    MAX(o.creation_date) AS creation_date, COUNT(*) AS occurency
    FROM cr_items ci, cr_revisionsx cr, acs_objects o 
    WHERE ci.item_id = cr.item_id 
    AND ci.item_id = o.object_id AND ci.content_type = 'qt_vehicle' 
    AND cr.title NOT SIMILAR TO '^[0-9]'  
    AND cr.title NOT IN ('UNKNOWN', 'FBF724','FBF124') 
    GROUP BY cr.title HAVING COUNT(*) > 1 
    ORDER BY COUNT(*) DESC  ;" -column_array revisions                                                                                                                                                 

p.s. The very same query runs just fine in the PSQL prompt command line.

 title  |     creation_date      | occurency 
--------+------------------------+-----------
 111111 | 2020-09-18 17:21:31+00 |      1340
 849500 | 2020-09-18 14:49:04+00 |       165
 411111 | 2020-09-18 16:53:23+00 |       120
 311111 | 2020-09-18 16:47:35+00 |        54
 333333 | 2020-09-18 14:19:34+00 |        50
 502246 | 2020-08-14 13:29:39+00 |        29
 502106 | 2020-08-13 10:07:24+00 |        28
 365507 | 2020-08-10 18:16:22+00 |        25
 900763 | 2020-09-01 10:21:08+00 |        22
 211111 | 2020-09-15 11:07:17+00 |        21
 895030 | 2020-09-17 10:15:28+00 |        20
 184950 | 2020-09-18 14:38:22+00 |        17

Is there any change to how we write PGSQL queries using regular expression in POSIX-style or SIMILAR TO ?

Best wishes, I

Reference: https://www.postgresql.org/docs/9.3/functions-matching.html

ERROR LOG

[18/Sep/2020:19:03:12][18107.7efbf3d70700][-conn:qonteo:default:0:121-] Error: invalid command name "0-9"
:        while executing
:    "0-9"
:        ("uplevel" body line 92)
:        invoked from within
:    "uplevel {
:        ad_page_contract {
:        @author Neophytos Demetriou <k2pts@cytanet.com.cy>
:        @creation-date September 01, 2001
:        @cvs-id $Id: search...."
:        (procedure "code::tcl::/var/www/qonteo/oacs//packages/qt-rest/www/search" line 2)
:        invoked from within
:    "code::tcl::$__adp_stub"
:        ("uplevel" body line 12)
:        invoked from within
:    "uplevel {
:    
:            if { [file exists $__adp_stub.tcl] } {
:    
:                # ensure that data source preparation procedure exists and is up-to-date
:          ..."
:        (procedure "adp_prepare" line 2)
:        invoked from within
:    "adp_prepare"
:        invoked from within
:    "template::adp_parse $themed_template {}"
:        (procedure "adp_parse_ad_conn_file" line 14)
:        invoked from within
:    "$handler"
:        ("uplevel" body line 2)
:        invoked from within
:    "uplevel $code"
:        invoked from within
:    "ad_try {
:                    $handler
:                } ad_script_abort val {
:                    # do nothing
:                }"
:        invoked from within
:    "rp_serve_concrete_file [ad_conn file]"
:        (procedure "::nsf::procs::rp_serve_abstract_file" line 60)
:        invoked from within
:    "rp_serve_abstract_file "$root/$extra_url""
:        ("uplevel" body line 2)
:        invoked from within
:    "uplevel $code"
:        invoked from within

This is actually a Tcl quoting question: Since you provided the SQL query between double quotes ("), Tcl applies substitution, finds something that looks like a Tcl command via command substitution and tries to execute the command 0-9.

In case, you do not want Tcl substitution, use curly braces like here:


db_0or1row select_revision {
  SELECT cr.title,  
    MAX(o.creation_date) AS creation_date, 
    COUNT(*) AS occurrence
  FROM cr_items ci, cr_revisionsx cr, acs_objects o 
  WHERE ci.item_id = cr.item_id 
    AND ci.item_id = o.object_id AND ci.content_type = 'qt_vehicle' 
    AND cr.title !~ '^[0-9]'  
    AND cr.title NOT IN ('UNKNOWN', 'FBF724','FBF124') 
  GROUP BY cr.title HAVING COUNT(*) > 1 
  ORDER BY COUNT(*) DESC
}

In general, Tcl substitutions should be avoided in SQL statements (or applied with special care), since these are potential attack vectors for SQL injection. Bind variables should be used instead wherever possible.

-g

PS: the markup character for code is for markdown a backtick (`) not a single quote (').

... and I agree! Each code chunk must be in its respective layer ( preferably in its own file). Meaning TCL must be in the TCL layer (.tcl files) as well ADP and SQL. Plus, I've always preferred to use curly braces. It's even better/easier to indent code in Emacs editor.

The bad habit took place when I was learning to pass array variables within SQL blocks. Using double quotes becomes much simpler. For example:

 db_foreach select_groups {
     select ap.package_id, r.object_id_one as group_id, g.group_name, mr.member_state
                from   acs_rels r,
                membership_rels mr,
                groups g,
                application_groups ap
                where  r.rel_type      = 'membership_rel'
                and    r.object_id_two = $auth_info(user_id)
                and    mr.rel_id   = r.rel_id
                and    g.group_id  = r.object_id_one
                and    ap.group_id = g.group_id
                order by lower(g.group_name)
            " { ... }

As a matter of fact, I have no idea how one would pass TCL arrays within SQL blocks. Neither :authinfo(userid) nor :authinfo.userid work. Both return syntax error.

[21/Sep/2020:23:12:56][13720.7efbf356f700][-conn:qonteo:default:1:0-] Error: undefined variable 'auth_info' : while executing ...

The bindvars support of the PostgreSQL driver is modeled after the Oracle bindvars support to provide compatibility. In Oracle, only scalar values are allowed (no Tcl arrays, dicts, etc.).

The OpenACS API supports the "-bind" argument to db_foreach, such one can write like:


db_foreach select_groups {
    select ap.package_id, ...
    from ...
    where  r.object_id_two = :user_id
    ...
} -bind [list user_id $auth_info(user_id)] {
    # loop body
}

or even


db_foreach select_groups {
    select ap.package_id, ...
    from ...
    where  r.object_id_two = :user_id
    ...
} -bind [array get auth_info] {
    # loop body
}

The advantage of the explicit list of bind names and values is that these can be passed around between procs without the danger of interfering with other Tcl variables. When when the variables and values form a Tcl dict, also "dict foreach" can be used.

PS: It seems that the "-bind" functionality for db_foreach was lost over the years. i've just committed a patch to oacs-5-10 to revive it.