Forum OpenACS Development: ways to insert and update db

Posted by Torben Brosten on
Hi,

I'm exploring ways to add a generic logging feature to db procs (on the application side instead of using db triggers).

What other public registered procs are there to insert or update the db from an OpenACS thread, besides db_dml?

Any other private registered procs besides db_exec and db_exec_plsql?

Also, are ns_pg_bind, ns_ora, ns_db, and ns_insertrow fairly resilient against exploitation in the context of OpenACS applications, since they require db handles?

thanks in advance,
Torben

Posted by Tom Jackson on

I think the best way to do it is to turn on sql logging (verbose=true) for the db pools you want to log. Then, I believe, there are perl scripts which can extract the db statements; otherwise the error log can be examined. If you use multilog with AOLserver, you could probably put in a filter to place db statements into a special file.

This method covers all db drivers and is easy to turn on/off per db pool.

ns_section "ns/db/pool/$pool"
ns_param maxidle           1000000000
ns_param maxopen           1000000000
ns_param connections       5
ns_param verbose           true ;# false
ns_param extendedtableinfo true
ns_param logsqlerrors      true ;# false
ns_param driver            "$poolDriver"
ns_param user              tom
ns_param datasource        localhost::remodel
ns_param password          *
Posted by Torben Brosten on
That may be the best way pg-wise, Tom, but I neglected to include some scoping factors:

1. This logging is only for non-core packages, that is, for applications that choose to use the logging feature.

2. The logging should not be db specific (Oracle compatibility is a long-term goal).

A trigger would be nice ( http://philip.greenspun.com/sql/triggers.html is inspiring ) if it could capture the sql directly and save it into a general logging facility, but then come these questions: Is there a way to have a trigger capture sql for both pg and oracle and put it in a general table? And how would this work for scheduled procs, which don't supply user_id and the other connection info we want logged in the "audit"/log table?

Unless those can be answered positively, I'm back to asking the first question:

What other public registered procs are there to insert or update the db from an OpenACS thread, besides db_dml?

Posted by Torben Brosten on
hmm.. since tcllib is now required with core.. perhaps ::comm::comm is another possibility
Posted by Tom Jackson on
Is it possible to provide more information on what you are trying to do? Are you considering modifying the db_* API? If so, it is better to speak up now, as it would affect everyone. Otherwise, you do know that every database type uses ns_db and the database module to communicate, right? Therefore, the logging I suggested would cover every database type.

But I haven't seen a statement of the issue yet which lends itself to providing all the information you want. Even the db_* API is necessarily removed from users/connections, and it is generally difficult to separate API calls by package, since one goal is to reuse code.

The best single place to figure all this out is the server.log file. Most versions of AOLserver print the thread id and/or the thread name with each log statement. It should be possible to add a log statement at the beginning of each request you want to log, or to log every request and then parse the file for specific information.

Logging everything does take up more disk space, and might be a little slower, but it is more generic and easier to adapt to future needs. For instance, you can look back to a previous day to figure out what was going on. If you target your logging, you are likely to limit it in arbitrary ways. And if you don't want to use it, it is easy to turn off.
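The log-parsing approach described above can be sketched in a few lines of Tcl. The log path and the "Notice: dbinit:" marker below are assumptions; check your own error log for the exact prefix your driver writes when verbose is on.

```tcl
# Sketch: extract sql statements from an AOLserver error log.
# Assumes verbose db logging is on and that sql lines carry a
# "Notice: dbinit:" marker -- both the path and the marker are
# guesses, so adjust them to match your installation.
set logfile "/var/log/aolserver/error.log"
set fd [open $logfile r]
while {[gets $fd line] >= 0} {
    if {[string match "*Notice: dbinit:*" $line]} {
        puts $line    ;# or append to a separate sql-only file
    }
}
close $fd
```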

One change I could see making life easier would be to use separate database pools for scheduled procs as opposed to connection threads; the scheduled procs would then be easy to track.
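A sketch of what such a dedicated pool might look like in the AOLserver config, alongside the pool shown earlier; the pool name and parameter values here are illustrative:

```tcl
# Sketch: a pool reserved for scheduled procs, with verbose
# logging on so their sql is easy to pick out of the error log.
# All values here are illustrative.
ns_section "ns/db/pool/scheduled"
ns_param connections  2
ns_param verbose      true ;# log every statement from this pool
ns_param logsqlerrors true
ns_param driver       "postgres"
ns_param datasource   localhost::mydb
ns_param user         myuser
ns_param password     *
```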

Posted by Torben Brosten on
I don't want to modify acs-core and affect everyone's systems arbitrarily by changing any existing procs.

My intentions are really twofold:

1. Create a procedure that can be run periodically to alert admins to code, db content, or user input that may be dubious or otherwise weaken a system's integrity. This is not meant to be bulletproof; it is just another way to help manage security. Daveb reminded me on irc yesterday that a filter, such as might be used in templating, should limit procs to only those that are authorized (similar to the AllowedAttribute, AllowedProtocol, and AllowedTag kernel parameters) instead of trying to identify only known vulnerabilities. This proc is more an extension of the automated testing, for regular auditing of code and db.

2. Create parallel db commands that log inserts and updates for all packages that choose to use the logging versions instead of db_dml etc. This method could also serve as an example for anyone who wants to extend the logging into db_dml itself to meet additional deployment requirements.
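A minimal sketch of what such a parallel command might look like, assuming a hypothetical audit table named log_sql; ad_proc, db_dml, ns_conn, and ad_conn are the standard OpenACS/AOLserver calls, and everything else is illustrative:

```tcl
# Sketch: a db_dml wrapper that records each statement in a
# (hypothetical) log_sql table before returning. Packages opt in
# by calling db_dml_logged instead of db_dml.
ad_proc -public db_dml_logged {
    statement_name
    sql
    args
} {
    Like db_dml, but also records the statement, the user who
    issued it (empty for scheduled procs), and a timestamp.
} {
    # scheduled procs have no connection, so user_id may be empty
    set user_id ""
    if { [ns_conn isconnected] } {
        set user_id [ad_conn user_id]
    }
    # run the real statement first
    eval db_dml $statement_name [list $sql] $args
    # then record it in the hypothetical audit table
    db_dml log_sql_insert {
        insert into log_sql (user_id, statement_name, sql_text, log_date)
        values (:user_id, :statement_name, :sql, current_timestamp)
    }
}
```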

Posted by Tom Jackson on
I hate sounding like a broken record, but this is exactly what the built-in logging for database drivers does for you. The only thing missing is the intelligence of what to look for, which, by the way, is the most difficult part. Right now, you can turn on logging and periodically review the logs. What would you look for? If you can't do it by hand, you will never be able to program it into code. But even if you can do it by hand, that doesn't mean you can create a program to do it. Programs can't be smarter than their author, although they can be faster.

Logging has a completely different focus than normal operations. If everything worked perfectly, you wouldn't need logging. So one requirement of a logging facility is that it is somewhat out of the hands of the normal user. First, they may not know to use it; second, they may want to disable it so they can go undetected while they do bad things. Admins usually have to hunt through log files looking for clues to bad behavior, and often it isn't known beforehand what to look for. If you create a selective logging facility, you limit your ability to apply any intelligence to the analysis.

But beyond this, I can't think of many packages which allow users to choose what sql or procs/scripts to run. Actually, the only one I can think of is my cronjob package. It specifically runs only with admin privileges, but the original goal was to allow regular users to submit code which could run after approval was obtained. The data model is in place so that each cronjob can be disabled/enabled or approved/not approved; there are just no regular-user pages to manage it. But this package places the review prior to execution, and review/approval must be done by an admin. It is also a good place to put regularly (or occasionally) run auditing/reporting code, since you can select any cronjob and run it instantly.

On the database side it is more difficult to be generic. Each database is different, and logging data requires a separate set of tables, which eventually slow down the system. Triggers work well, but there are also the database's own log files. They can probably be more selective than the ns_db logging, but they also contain less context: the database doesn't know anything about users or packages.

Posted by Torben Brosten on
Tom, I am not discounting your statements. I pretty much agree with them. This exercise is part of creating a general form of https://openacs.org/doc/ecommerce/audit that can be used by multiple packages ( http://dekka.com/etp/ecommerce-g2-package-list ) while not overly complicating the data model.