test-doc - Transactions

I OpenACS For Everyone
- I.1 High level information: What is OpenACS?
  - I.1.1 Overview
  - I.1.2 OpenACS Release Notes
- I.2 OpenACS: robust web development framework
  - I.2.1 Introduction
  - I.2.2 Basic infrastructure
  - I.2.3 Advanced infrastructure
  - I.2.4 Domain level tools
II Administrator's Guide
- II.2 Installation Overview
  - II.2.1 Basic Steps
  - II.2.2 Prerequisite Software
- II.3 Complete Installation
  - II.3.1 Install a Unix-like system and supporting software
  - II.3.2 Install Oracle 10g XE on debian
    - II.3.2.1 Install Oracle 8.1.7
  - II.3.3 Install PostgreSQL
  - II.3.4 Install AOLserver 4
  - II.3.5 Quick Install of OpenACS
    - II.3.5.1 Complex Install OpenACS 5.3
  - II.3.6 OpenACS Installation Guide for Windows2000
  - II.3.7 OpenACS Installation Guide for Mac OS X
- II.4 Configuring a new OpenACS Site
  - II.4.1 Installing OpenACS packages
  - II.4.2 Mounting OpenACS packages
  - II.4.3 Configuring an OpenACS package
  - II.4.4 Setting Permissions on an OpenACS package
  - II.4.5 How Do I?
  - II.4.6 Configure OpenACS look and feel with templates
- II.5 Upgrading
  - II.5.1 Overview
  - II.5.2 Upgrading 4.5 or higher to 4.6.3
  - II.5.3 Upgrading OpenACS 4.6.3 to 5.0
  - II.5.4 Upgrading an OpenACS 5.0.0 or greater installation
  - II.5.5 Upgrading the OpenACS files
  - II.5.6 Upgrading Platform components
- II.6 Production Environments
  - II.6.1 Starting and Stopping an OpenACS instance.
  - II.6.2 AOLserver keepalive with inittab
  - II.6.3 Running multiple services on one machine
  - II.6.4 High Availability/High Performance Configurations
  - II.6.5 Staged Deployment for Production Networks
  - II.6.6 Installing SSL Support for an OpenACS service
  - II.6.7 Set up Log Analysis Reports
  - II.6.8 External uptime validation
  - II.6.9 Diagnosing Performance Problems
- II.7 Database Management
  - II.7.1 Running a PostgreSQL database on another server
  - II.7.2 Deleting a tablespace
  - II.7.3 Vacuum Postgres nightly
- II.8 Backup and Recovery
  - II.8.1 Backup Strategy
  - II.8.2 Manual backup and recovery
  - II.8.3 Automated Backup
  - II.8.4 Using CVS for backup-recovery
- II.A Install Red Hat 8/9
- II.B Install additional supporting software
  - II.B.1 Unpack the OpenACS tarball
  - II.B.2 Initialize CVS (OPTIONAL)
  - II.B.3 Add PSGML commands to emacs init file (OPTIONAL)
  - II.B.4 Install Daemontools (OPTIONAL)
  - II.B.5 Install qmail (OPTIONAL)
  - II.B.6 Install Analog web file analyzer
  - II.B.7 Install nspam
  - II.B.8 Install Full Text Search
  - II.B.9 Install Full Text Search using Tsearch2
  - II.B.10 Install Full Text Search using OpenFTS (deprecated see tsearch2)
  - II.B.11 Install nsopenssl
  - II.B.12 Install tclwebtest.
  - II.B.13 Install PHP for use in AOLserver
  - II.B.14 Install Squirrelmail for use as a webmail system for OpenACS
  - II.B.15 Install PAM Radius for use as external authentication
  - II.B.16 Install LDAP for use as external authentication
  - II.B.17 Install AOLserver 3.3oacs1
- II.C Credits
  - II.C.1 Where did this document come from?
  - II.C.2 Linux Install Guides
  - II.C.3 Security Information
  - II.C.4 Resources
III For OpenACS Package Developers
- III.9 Development Tutorial
  - III.9.1 Creating an Application Package
  - III.9.2 Setting Up Database Objects
  - III.9.3 Creating Web Pages
  - III.9.4 Debugging and Automated Testing
- III.10 Advanced Topics
  - III.10.1 Write the Requirements and Design Specs
  - III.10.2 Add the new package to CVS
  - III.10.3 OpenACS Edit This Page Templates
  - III.10.4 Adding Comments
  - III.10.5 Admin Pages
  - III.10.6 Categories
  - III.10.7 Profile your code
  - III.10.8 Prepare the package for distribution.
  - III.10.9 Distributing upgrades of your package
  - III.10.10 Notifications
  - III.10.11 Hierarchical data
  - III.10.12 Using .vuh files for pretty urls
  - III.10.13 Laying out a page with CSS instead of tables
  - III.10.14 Sending HTML email from your application
  - III.10.15 Basic Caching
  - III.10.16 Scheduled Procedures
  - III.10.17 Enabling WYSIWYG
  - III.10.18 Adding in parameters for your package
  - III.10.19 Writing upgrade scripts
  - III.10.20 Connect to a second database
  - III.10.21 Future Topics
- III.11 Development Reference
  - III.11.1 OpenACS Packages
  - III.11.2 OpenACS Data Models and the Object System
  - III.11.3 The Request Processor
  - III.11.4 The OpenACS Database Access API
  - III.11.5 Using Templates in OpenACS
  - III.11.6 Groups, Context, Permissions
  - III.11.7 Writing OpenACS Application Pages
  - III.11.8 Parties in OpenACS
  - III.11.9 OpenACS Permissions Tediously Explained
  - III.11.10 Object Identity
  - III.11.11 Programming with AOLserver
  - III.11.12 Using Form Builder: building html forms dynamically
- III.12 Engineering Standards
  - III.12.1 OpenACS Style Guide
  - III.12.2 Release Version Numbering
  - III.12.3 Constraint naming standard
  - III.12.4 ACS File Naming and Formatting Standards
  - III.12.5 PL/SQL Standards
  - III.12.6 Variables
  - III.12.7 Automated Testing
- III.13 CVS Guidelines
  - III.13.1 Using CVS with OpenACS
  - III.13.2 OpenACS CVS Concepts
  - III.13.3 Contributing code back to OpenACS
  - III.13.4 Additional Resources for CVS
- III.14 Documentation Standards
  - III.14.1 OpenACS Documentation Guide
  - III.14.2 Using PSGML mode in Emacs
  - III.14.3 Using nXML mode in Emacs
  - III.14.4 Detailed Design Documentation Template
  - III.14.5 System/Application Requirements Template
- III.15 TCLWebtest
  - III.15.1 API test
  - III.15.2 Webtest
- III.16 Internationalization
  - III.16.1 Internationalization and Localization Overview
  - III.16.2 How Internationalization/Localization works in OpenACS
  - III.16.4 Design Notes
  - III.16.5 Translator's Guide
- III.D Using CVS with an OpenACS Site
IV For OpenACS Platform Developers
- IV.17 Kernel Documentation
  - IV.17.1 Overview
  - IV.17.2 Object Model Requirements
  - IV.17.3 Object Model Design
  - IV.17.4 Permissions Requirements
  - IV.17.5 Permissions Design
  - IV.17.6 Groups Requirements
  - IV.17.7 Groups Design
  - IV.17.8 Subsites Requirements
  - IV.17.9 Subsites Design Document
  - IV.17.10 Package Manager Requirements
  - IV.17.11 Package Manager Design
  - IV.17.12 Database Access API
  - IV.17.13 OpenACS Internationalization Requirements
  - IV.17.14 Security Requirements
  - IV.17.15 Security Design
  - IV.17.16 Security Notes
  - IV.17.17 Request Processor Requirements
  - IV.17.18 Request Processor Design
  - IV.17.19 Documenting Tcl Files: Page Contracts and Libraries
  - IV.17.20 Bootstrapping OpenACS
  - IV.17.21 External Authentication Requirements
- IV.18 Releasing OpenACS
  - IV.18.1 OpenACS Core and .LRN
  - IV.18.2 How to Update the OpenACS.org repository
  - IV.18.3 How to package and release an OpenACS Package
  - IV.18.4 How to Update the translations
V Tcl for Web Nerds
- V.1 Tcl for Web Nerds Introduction
- V.2 Basic String Operations
- V.3 List Operations
- V.4 Pattern matching
- V.5 Array Operations
- V.6 Numbers
- V.7 Control Structure
- V.8 Scope, Upvar and Uplevel
- V.9 File Operations
- V.10 Eval
- V.11 Exec
- V.12 Tcl for Web Use
- V.13 OpenACS conventions for TCL
- V.14 Solutions
VI SQL for Web Nerds
- VI.1 SQL Tutorial
  - VI.1.1 SQL Tutorial
  - VI.1.2 Answers
- VI.2 SQL for Web Nerds Introduction
- VI.3 Data modeling
  - VI.3.1 The Discussion Forum -- philg's personal odyssey
  - VI.3.2 Data Types (Oracle)
  - VI.3.4 Tables
  - VI.3.5 Constraints
- VI.4 Simple queries
- VI.5 More complex queries
- VI.6 Transactions
- VI.7 Triggers
- VI.8 Views
- VI.9 Style
- VI.10 Escaping to the procedural world
- VI.11 Trees

97.62%

· Index

VI.6 Transactions

In the introduction we covered some examples of inserting data into a database by typing at SQL*Plus:


insert into mailing_list (name, email)
values ('Philip Greenspun','philg@mit.edu');

Generally, this is not how it is done. As a programmer, you write code that gets executed every time a user submits a discussion forum posting or classified ad. The structure of the SQL statement remains fixed but not the string literals after the values.

The simplest and most direct interface to a relational database involves a procedural program in C, Java, Lisp, Perl, or Tcl putting together a string of SQL that is then sent to to the RDBMS. Here's how the ArsDigita Community System constructs a new entry in the clickthrough log:


insert into clickthrough_log 
 (local_url, foreign_url, entry_date, click_count)
values
 ('$local_url', '$foreign_url', trunc(sysdate), 1)"

The INSERT statement adds one row, filling in the four list columns. Two of the values come from local variables set within the Web server, $local_url and $foreign_url. Because these are strings, they must be surrounded by single quotes. One of the values is dynamic and comes straight from Oracle: trunc(sysdate). Recall that the date data type in Oracle is precise to the second. We only want one of these rows per day of the year and hence truncate the date to midnight. Finally, as this is the first clickthrough of the day, we insert a constant value of 1 for click_count.

Atomicity

Each SQL statement executes as an atomic transaction. For example, suppose that you were to attempt to purge some old data with


delete from clickthrough_log where entry_date + 120 < sysdate;

(delete clickthrough records more than 120 days old) and that 3500 rows in clickthrough_log are older than 120 days. If your computer failed halfway through the execution of this DELETE, i.e., before the transaction committed, you would find that none of the rows had been deleted. Either all 3500 rows will disappear or none will.

More interestingly, you can wrap a transaction around multiple SQL statements. For example, when a user is editing a comment, the ArsDigita Community System keeps a record of what was there before:


ns_db dml $db "begin transaction" 

# insert into the audit table
ns_db dml $db "insert into general_comments_audit
 (comment_id, user_id, ip_address, audit_entry_time, modified_date, content)
select comment_id, 
       user_id, 
       '[ns_conn peeraddr]', 
       sysdate, 
       modified_date, 
       content from general_comments 
where comment_id = $comment_id"

# change the publicly viewable table
ns_db dml $db "update general_comments
set content = '$QQcontent', 
    html_p = '$html_p'
where comment_id = $comment_id"

# commit the transaction
ns_db dml $db "end transaction"

This is generally referred to in the database industry as auditing. The database itself is used to keep track of what has been changed and by whom.

Let's look at these sections piece by piece. We're looking at a Tcl program calling AOLserver API procedures when it wants to talk to Oracle. We've configured the system to reverse the normal Oracle world order in which everything is within a transaction unless otherwise committed. The begin transaction and end transaction statements never get through to Oracle; they are merely instructions to our Oracle driver to flip Oracle out and then back into autocommit mode.

The transaction wrapper is imposed around two SQL statements. The first statement inserts a row into general_comments_audit. We could simply query the general_comments table from Tcl and then use the returned data to create a standard-looking INSERT. However, if what you're actually doing is moving data from one place within the RDBMS to another, it is extremely bad taste to drag it all the way out to an application program and then stuff it back in. Much better to use the "INSERT ... SELECT" form.

Note that two of the columns we're querying from general_comments don't exist in the table: sysdate and '[ns_conn peeraddr]'. It is legal in SQL to put function calls or constants in your select list, just as you saw at the beginning of the Queries chapter where we discussed Oracle's one-row system table: dual. To refresh your memory:


select sysdate from dual;

SYSDATE
----------
1999-01-14

You can compute multiple values in a single query:


select sysdate, 2+2, atan2(0, -1) from dual;

SYSDATE 	  2+2 ATAN2(0,-1)
---------- ---------- -----------
1999-01-14	    4  3.14159265

This approach is useful in the transaction above, where we combine information from a table with constants and function calls. Here's a simpler example:


select posting_time, 2+2 
from bboard 
where msg_id = '000KWj';

POSTING_TI	  2+2
---------- ----------
1998-12-13	    4

Let's get back to our comment editing transaction and look at the basic structure:

open a transaction
insert into an audit table whatever comes back from a SELECT statement on the comment table
update the comment table
close the transaction

Suppose that something goes wrong during the INSERT. The tablespace in which the audit table resides is full and it isn't possible to add a row. Putting the INSERT and UPDATE in the same RDBMS transactions ensures that if there is a problem with one, the other won't be applied to the database.

Consistency

Suppose that we've looked at a message on the bulletin board and decide that its content is so offensive we wish to delete the user from our system:


select user_id from bboard where msg_id = '000KWj';

   USER_ID
----------
     39685

delete from users where user_id = 39685;
*
ERROR at line 1:
ORA-02292: integrity constraint (PHOTONET.SYS_C001526) violated - child record
found

Oracle has stopped us from deleting user 39685 because to do so would leave the database in an inconsistent state. Here's the definition of the bboard table:


create table bboard (
	msg_id		char(6) not null primary key,
	refers_to	char(6),
        ...
	user_id		integer not null references users,
	one_line	varchar(700),
	message		clob,
	...
);

The user_id column is constrained to be not null. Furthermore, the value in this column must correspond to some row in the users table (references users). By asking Oracle to delete the author of msg_id 000KWj from the users table before we deleted all of his or her postings from the bboard table, we were asking Oracle to leave the RDBMS in an inconsistent state.

Mutual Exclusion

When you have multiple simultaneously executing copies of the same program, you have to think about mutual exclusion. If a program has to

read a value from the database
perform a computation based on that value
update the value in the database based on the computation

Then you want to make sure only one copy of the program is executing at a time through this segment.

The /bboard module of the ArsDigita Community System has to do this. The sequence is

read the last message ID from the msg_id_generator table
increment the message ID with a bizarre collection of Tcl scripts
update the last_msg_id column in the msg_id_generator table

First, anything having to do with locks only makes sense when the three operations are grouped together in a transaction. Second, to avoid deadlocks a transaction must acquire all the resources (including locks) that it needs at the start of the transaction. A SELECT in Oracle does not acquire any locks but a SELECT .. FOR UPDATE does. Here's the beginning of the transaction that inserts a message into the bboard table (from /bboard/insert-msg.tcl):


select last_msg_id
from msg_id_generator
for update of last_msg_id

Mutual Exclusion (the Big Hammer)

The for update clause isn't a panacea. For example, in the Action Network (described in Chapter 16 of Philip and Alex's Guide to Web Publishing), we need to make sure that a double-clicking user doesn't generate duplicate FAXes to politicians. The test to see if the user has already responded is


select count(*) from an_alert_log 
where member_id = $member_id
and entry_type = 'sent_response'
and alert_id = $alert_id

By default, Oracle locks one row at a time and doesn't want you to throw a FOR UPDATE clause into a SELECT COUNT(*). The implication of that would be Oracle recording locks on every row in the table. Much more efficient is simply to start the transaction with


lock table an_alert_log in exclusive mode

This is a big hammer and you don't want to hold a table lock for more than an instant. So the structure of a page that gets a table lock should be

open a transaction
lock table
select count(*)
if the count was 0, insert a row to record the fact that the user has responded
commit the transaction (releases the table lock)
proceed with the rest of the script
...

What if I just want some unique numbers?

Does it really have to be this hard? What if you just want some unique integers, each of which will be used as a primary key? Consider a table to hold news items for a Web site:


create table news (
	title		varchar(100) not null,
	body		varchar(4000) not null,
	release_date	date not null,
        ...
);

You might think you could use the title column as a key, but consider the following articles:


insert into news (title, body, release_date)
values
('French Air Traffic Controllers Strike',
 'A walkout today by controllers left travelers stranded..',
 '1995-12-14');

insert into news (title, body, release_date)
values
('French Air Traffic Controllers Strike',
 'Passengers at Orly faced 400 canceled flights ...',
 '1997-05-01');

insert into news (title, body, release_date)
values
('Bill Clinton Beats the Rap',
 'Only 55 senators were convinced that President Clinton obstructed justice ...',
 '1999-02-12');

insert into news (title, body, release_date)
values
('Bill Clinton Beats the Rap',
 'The sexual harassment suit by Paula Jones was dismissed ...',
 '1998-12-02);

It would seem that, at least as far as headlines are concerned, little of what is reported is truly new. Could we add

	primary key (title, release_date)

at the end of our table definition? Absolutely. But keying by title and date would result in some unwieldy URLs for editing or approving news articles. If your site allows public suggestions, you might find submissions from multiple users colliding. If you accept comments on news articles, a standard feature of the ArsDigita Community System, each comment must reference a news article. You'd have to be sure to update both the comments table and the news table if you needed to correct a typo in the title column or changed the release_date.

The traditional database design that gets around all of these problems is the use of a generated key. If you've been annoyed by having to carry around your student ID at MIT or your patient ID at a hospital, now you understand the reason why: the programmers are using generated keys and making their lives a bit easier by exposing this part of their software's innards.

Here's how the news module of the ArsDigita Community System works, in an excerpt from http://software.arsdigita.com/www/doc/sql/news.sql:


create sequence news_id_sequence start with 1;

create table news (
	news_id		integer primary key,
	title		varchar(100) not null,
	body		varchar(4000) not null,
	release_date	date not null,
	...
);

We're taking advantage of the nonstandard but very useful Oracle sequence facility. In almost any Oracle SQL statement, you can ask for a sequence's current value or next value.


SQL> create sequence foo_sequence;

Sequence created.

SQL> select foo_sequence.currval from dual;

ERROR at line 1:
ORA-08002: sequence FOO_SEQUENCE.CURRVAL is not yet defined in this session

Oops! Looks like we can't ask for the current value until we've asked for at least one key in our current session with Oracle.


SQL> select foo_sequence.nextval from dual;

   NEXTVAL
----------
	 1

SQL> select foo_sequence.nextval from dual;

   NEXTVAL
----------
	 2

SQL> select foo_sequence.nextval from dual;

   NEXTVAL
----------
	 3

SQL> select foo_sequence.currval from dual;

   CURRVAL
----------
	 3

You can use the sequence generator directly in an insert, e.g.,


insert into news (news_id, title, body, release_date)
values
(news_id_sequence.nextval, 
 'Tuition Refund at MIT',
 'Administrators were shocked and horrified ...',
 '1998-03-12);

Background on this story: http://philip.greenspun.com/school/tuition-free-mit.html

In the ArsDigita Community System implementation, the news_id is actually generated in /news/post-new-2.tcl:


set news_id [database_to_tcl_string $db "select news_id_sequence.nextval from dual"]

This way the page that actually does the database insert, /news/post-new-3.tcl, can be sure when the user has inadvertently hit submit twice:


if [catch { ns_db dml $db "insert into news
(news_id, title, body, html_p, approved_p, 
 release_date, expiration_date, creation_date, creation_user, 
 creation_ip_address)
values
($news_id, '$QQtitle', '$QQbody', '$html_p', '$approved_p',
 '$release_date', '$expiration_date', sysdate, $user_id, 
 '$creation_ip_address')" } errmsg] {
    # insert failed; let's see if it was because of duplicate submission
    if { [database_to_tcl_string $db "select count(*) 
                                      from news
                                      where news_id = $news_id"] == 0 } {
        # some error other than dupe submission
	ad_return_error "Insert Failed" "The database ..."
        return
    }
    # we don't bother to handle the cases where there is a dupe submission
    # because the user should be thanked or redirected anyway
}

In our experience, the standard technique of generating the key at the same time as the insert leads to a lot of duplicate information in the database.

Sequence Caveats

Oracle sequences are optimized for speed. Hence they offer the minimum guarantees that Oracle thinks are required for primary key generation and no more.

If you ask for a few nextvals and roll back your transaction, the sequence will not be rolled back.

You can't rely on sequence values to be, uh, sequential. They will be unique. They will be monotonically increasing. But there might be gaps. The gaps arise because Oracle pulls, by default, 20 sequence values into memory and records those values as used on disk. This makes nextval very fast since the new value need only be marked use in RAM and not on disk. But suppose that someone pulls the plug on your database server after only two sequence values have been handed out. If your database administrator and system administrator are working well together, the computer will come back to life running Oracle. But there will be a gap of 18 values in the sequence (e.g., from 2023 to 2041). That's because Oracle recorded 20 values used on disk and only handed out 2.

Oracle8 Server Application Developer's Guide, Controlling Transactions
Orace8 Server SQL Reference, CREATE SEQUENCE section

---

based on SQL for Web Nerds

Categories: beginner (Audience)