- I OpenACS For Everyone
- I.1 High level information: What is OpenACS?
- I.1.1 Overview
- I.1.2 OpenACS Release Notes
- I.2 OpenACS: robust web development framework
- I.2.1 Introduction
- I.2.2 Basic infrastructure
- I.2.3 Advanced infrastructure
- I.2.4 Domain level tools
- I.1 High level information: What is OpenACS?
- II Administrator's Guide
- II.2 Installation Overview
- II.2.1 Basic Steps
- II.2.2 Prerequisite Software
- II.3 Complete Installation
- II.3.1 Install a Unix-like system and supporting software
- II.3.2 Install Oracle 10g XE on debian
- II.3.2.1 Install Oracle 8.1.7
- II.3.3 Install PostgreSQL
- II.3.4 Install AOLserver 4
- II.3.5 Quick Install of OpenACS
- II.3.5.1 Complex Install OpenACS 5.3
- II.3.6 OpenACS Installation Guide for Windows2000
- II.3.7 OpenACS Installation Guide for Mac OS X
- II.4 Configuring a new OpenACS Site
- II.4.1 Installing OpenACS packages
- II.4.2 Mounting OpenACS packages
- II.4.3 Configuring an OpenACS package
- II.4.4 Setting Permissions on an OpenACS package
- II.4.5 How Do I?
- II.4.6 Configure OpenACS look and feel with templates
- II.5 Upgrading
- II.5.1 Overview
- II.5.2 Upgrading 4.5 or higher to 4.6.3
- II.5.3 Upgrading OpenACS 4.6.3 to 5.0
- II.5.4 Upgrading an OpenACS 5.0.0 or greater installation
- II.5.5 Upgrading the OpenACS files
- II.5.6 Upgrading Platform components
- II.6 Production Environments
- II.6.1 Starting and Stopping an OpenACS instance.
- II.6.2 AOLserver keepalive with inittab
- II.6.3 Running multiple services on one machine
- II.6.4 High Availability/High Performance Configurations
- II.6.5 Staged Deployment for Production Networks
- II.6.6 Installing SSL Support for an OpenACS service
- II.6.7 Set up Log Analysis Reports
- II.6.8 External uptime validation
- II.6.9 Diagnosing Performance Problems
- II.7 Database Management
- II.7.1 Running a PostgreSQL database on another server
- II.7.2 Deleting a tablespace
- II.7.3 Vacuum Postgres nightly
- II.8 Backup and Recovery
- II.8.1 Backup Strategy
- II.8.2 Manual backup and recovery
- II.8.3 Automated Backup
- II.8.4 Using CVS for backup-recovery
- II.A Install Red Hat 8/9
- II.B Install additional supporting software
- II.B.1 Unpack the OpenACS tarball
- II.B.2 Initialize CVS (OPTIONAL)
- II.B.3 Add PSGML commands to emacs init file (OPTIONAL)
- II.B.4 Install Daemontools (OPTIONAL)
- II.B.5 Install qmail (OPTIONAL)
- II.B.6 Install Analog web file analyzer
- II.B.7 Install nspam
- II.B.8 Install Full Text Search
- II.B.9 Install Full Text Search using Tsearch2
- II.B.10 Install Full Text Search using OpenFTS (deprecated see tsearch2)
- II.B.11 Install nsopenssl
- II.B.12 Install tclwebtest.
- II.B.13 Install PHP for use in AOLserver
- II.B.14 Install Squirrelmail for use as a webmail system for OpenACS
- II.B.15 Install PAM Radius for use as external authentication
- II.B.16 Install LDAP for use as external authentication
- II.B.17 Install AOLserver 3.3oacs1
- II.C Credits
- II.C.1 Where did this document come from?
- II.C.2 Linux Install Guides
- II.C.3 Security Information
- II.C.4 Resources
- II.2 Installation Overview
- III For OpenACS Package Developers
- III.9 Development Tutorial
- III.9.1 Creating an Application Package
- III.9.2 Setting Up Database Objects
- III.9.3 Creating Web Pages
- III.9.4 Debugging and Automated Testing
- III.10 Advanced Topics
- III.10.1 Write the Requirements and Design Specs
- III.10.2 Add the new package to CVS
- III.10.3 OpenACS Edit This Page Templates
- III.10.4 Adding Comments
- III.10.5 Admin Pages
- III.10.6 Categories
- III.10.7 Profile your code
- III.10.8 Prepare the package for distribution.
- III.10.9 Distributing upgrades of your package
- III.10.10 Notifications
- III.10.11 Hierarchical data
- III.10.12 Using .vuh files for pretty urls
- III.10.13 Laying out a page with CSS instead of tables
- III.10.14 Sending HTML email from your application
- III.10.15 Basic Caching
- III.10.16 Scheduled Procedures
- III.10.17 Enabling WYSIWYG
- III.10.18 Adding in parameters for your package
- III.10.19 Writing upgrade scripts
- III.10.20 Connect to a second database
- III.10.21 Future Topics
- III.11 Development Reference
- III.11.1 OpenACS Packages
- III.11.2 OpenACS Data Models and the Object System
- III.11.3 The Request Processor
- III.11.4 The OpenACS Database Access API
- III.11.5 Using Templates in OpenACS
- III.11.6 Groups, Context, Permissions
- III.11.7 Writing OpenACS Application Pages
- III.11.8 Parties in OpenACS
- III.11.9 OpenACS Permissions Tediously Explained
- III.11.10 Object Identity
- III.11.11 Programming with AOLserver
- III.11.12 Using Form Builder: building html forms dynamically
- III.12 Engineering Standards
- III.12.1 OpenACS Style Guide
- III.12.2 Release Version Numbering
- III.12.3 Constraint naming standard
- III.12.4 ACS File Naming and Formatting Standards
- III.12.5 PL/SQL Standards
- III.12.6 Variables
- III.12.7 Automated Testing
- III.13 CVS Guidelines
- III.13.1 Using CVS with OpenACS
- III.13.2 OpenACS CVS Concepts
- III.13.3 Contributing code back to OpenACS
- III.13.4 Additional Resources for CVS
- III.14 Documentation Standards
- III.14.1 OpenACS Documentation Guide
- III.14.2 Using PSGML mode in Emacs
- III.14.3 Using nXML mode in Emacs
- III.14.4 Detailed Design Documentation Template
- III.14.5 System/Application Requirements Template
- III.15 TCLWebtest
- III.16 Internationalization
- III.16.1 Internationalization and Localization Overview
- III.16.2 How Internationalization/Localization works in OpenACS
- III.16.4 Design Notes
- III.16.5 Translator's Guide
- III.D Using CVS with an OpenACS Site
- III.9 Development Tutorial
- IV For OpenACS Platform Developers
- IV.17 Kernel Documentation
- IV.17.1 Overview
- IV.17.2 Object Model Requirements
- IV.17.3 Object Model Design
- IV.17.4 Permissions Requirements
- IV.17.5 Permissions Design
- IV.17.6 Groups Requirements
- IV.17.7 Groups Design
- IV.17.8 Subsites Requirements
- IV.17.9 Subsites Design Document
- IV.17.10 Package Manager Requirements
- IV.17.11 Package Manager Design
- IV.17.12 Database Access API
- IV.17.13 OpenACS Internationalization Requirements
- IV.17.14 Security Requirements
- IV.17.15 Security Design
- IV.17.16 Security Notes
- IV.17.17 Request Processor Requirements
- IV.17.18 Request Processor Design
- IV.17.19 Documenting Tcl Files: Page Contracts and Libraries
- IV.17.20 Bootstrapping OpenACS
- IV.17.21 External Authentication Requirements
- IV.18 Releasing OpenACS
- IV.18.1 OpenACS Core and .LRN
- IV.18.2 How to Update the OpenACS.org repository
- IV.18.3 How to package and release an OpenACS Package
- IV.18.4 How to Update the translations
- IV.17 Kernel Documentation
- V Tcl for Web Nerds
- V.1 Tcl for Web Nerds Introduction
- V.2 Basic String Operations
- V.3 List Operations
- V.4 Pattern matching
- V.5 Array Operations
- V.6 Numbers
- V.7 Control Structure
- V.8 Scope, Upvar and Uplevel
- V.9 File Operations
- V.10 Eval
- V.11 Exec
- V.12 Tcl for Web Use
- V.13 OpenACS conventions for TCL
- V.14 Solutions
- VI SQL for Web Nerds
- VI.1 SQL Tutorial
- VI.1.1 SQL Tutorial
- VI.1.2 Answers
- VI.2 SQL for Web Nerds Introduction
- VI.3 Data modeling
- VI.3.1 The Discussion Forum -- philg's personal odyssey
- VI.3.2 Data Types (Oracle)
- VI.3.4 Tables
- VI.3.5 Constraints
- VI.4 Simple queries
- VI.5 More complex queries
- VI.6 Transactions
- VI.7 Triggers
- VI.8 Views
- VI.9 Style
- VI.10 Escaping to the procedural world
- VI.11 Trees
- VI.1 SQL Tutorial
VI.3.1 The Discussion Forum -- philg's personal odyssey
Back in 1995, I built a threaded discussion forum, described ad nauseum in http://philip.greenspun.com/wtr/dead-trees/53013.htm. Here's how I stored the postings:German order reigns inside the system itself: messages are uniquely keyed with
create table bboard (
msg_id char(6) not null primary key,
refers_to char(6),
email varchar(200),
name varchar(200),
one_line varchar(700),
message clob,
notify char(1) default 'f' check (notify in ('t','f')),
posting_time date,
sort_key varchar(600)
);
msg_id
, refer to each other (i.e., say "I'm a
response to msg X") with refers_to
, and a thread can be
displayed conveniently by using the sort_key
column.
Italian chaos is permitted in the email
and
name
columns; users could remain anonymous, masquerade as
"president@whitehouse.gov" or give any name.
This seemed like a good idea when I built the system. I was concerned that it work reliably. I didn't care whether or not users put in bogus content; the admin pages made it really easy to remove such postings and, in any case, if someone had something interesting to say but needed to remain anonymous, why should the system reject their posting?
One hundred thousand postings later, as the moderator of the photo.net Q&A forum, I began to see the dimensions of my data modeling mistakes.
First, anonymous postings and fake email addresses didn't come from Microsoft employees revealing the dark truth about their evil bosses. They came from complete losers trying and failing to be funny or wishing to humiliate other readers. Some fake addresses came from people scared by the rising tide of spam email (not a serious problem back in 1995).
Second, I didn't realize how the combination of my email alert systems, fake email addresses, and Unix mailers would result in my personal mailbox filling up with messages that couldn't be delivered to "asdf@asdf.com" or "duh@duh.net".
Although the solution involved changing some Web scripts, fundamentally the fix was add a column to store the IP address from which a post was made:
Keeping these data enabled me to see that most of the anonymous posters were people who'd been using the forum for some time, typically from the same IP address. I just sent them mail and asked them to stop, explaining the problem with bounced email.
alter table bboard add (originating_ip varchar(16));
After four years of operating the photo.net community, it became apparent that we needed ways to
- display site history for users who had changed their email addresses
- discourage problem users from burdening the moderators and the community
- carefully tie together user-contributed content in the various subsystems of photo.net
Note that a contributor's name and email address no longer appear in the
create table users (
user_id integer not null primary key,
first_names varchar(100) not null,
last_name varchar(100) not null,
email varchar(100) not null unique,
..
);
create table bboard (
msg_id char(6) not null primary key,
refers_to char(6),
topic varchar(100) not null references bboard_topics,
category varchar(200), -- only used for categorized Q&A forums
originating_ip varchar(16), -- stored as string, separated by periods
user_id integer not null references users,
one_line varchar(700),
message clob,
-- html_p - is the message in html or not
html_p char(1) default 'f' check (html_p in ('t','f')),
...
);
create table classified_ads (
classified_ad_id integer not null primary key,
user_id integer not null references users,
...
);
bboard
table. That doesn't mean we don't know who posted a
message. In fact, this data model can't even represent an anonymous
posting: user_id integer not null references users
requires that each posting be associated with a user ID and that there
actually be a row in the users
table with that ID.
First, let's talk about how much fun it is to move a live-on-the-Web
600,000 hit/day service from one data model to another. In this case,
note that the original bboard
data model had a single
name
column. The community system has separate columns for
first and last names. A conversion script can easily split up "Joe
Smith" but what is it to do with William Henry Gates
III?
How do we copy over anonymous postings? Remember that Oracle is not
flexible or intelligent. We said that we wanted every row in the
bboard
table to reference a row in the users
table. Oracle will abort any transaction that would result in a
violation of this integrity constraint. So we either have to drop all
those anonymous postings (and any non-anonymous postings that refer to
them) or we have to create a user called "Anonymous" and assign all the
anonymous postings to that person. The technical term for this kind of
solution is kludge.
A more difficult problem than anonymous postings is presented by long-time users who have difficulty typing and or keeping a job. Consider a user who has identified himself as
- Joe Smith; jsmith@ibm.com
- Jo Smith; jsmith@ibm.com (typo in name)
- Joseph Smith; jsmth@ibm.com (typo in email)
- Joe Smith; cantuseworkaddr@hotmail.com (new IBM policy)
- Joe Smith-Jones; joe_smithjones@hp.com (got married, changed name, changed jobs)
- Joe Smith-Jones; jsmith@somedivision.hp.com (valid but not canonical corporate email address)
- Josephina Smith; jsmith@somedivision.hp.com (sex change; divorce)
- Josephina Smith; josephina_smith@hp.com (new corporate address)
- Siddhartha Bodhisattva; josephina_smith@hp.com (change of philosophy)
- Siddhartha Bodhisattva; thinkwaitfast@hotmail.com (traveling for awhile to find enlightenment)
Once we've copied everything into this new normalized data model, notice that we can't dig ourselves into the same hole again. If a user has contributed 1000 postings, we don't have 1000 different records of that person's name and email address. If a user changes jobs, we need only update one column in one row in one table.
The html_p
column in the new data model is worth
mentioning. In 1995, I didn't understand the problems of user-submitted
data. Some users will submit plain text, which seems simple, but in
fact you can't just spit this out as HTML. If user A typed < or >
characters, they might get swallowed by user B's Web browser. Does this
matter? Consider that "<g>" is interpreted in various online
circles as an abbreviation for "grin" but by Netscape Navigator as an
unrecognized (and therefore ignore) HTML tag. Compare the meaning of
"We shouldn't think it unfair that Bill Gates has more wealth than the 100 million poorest Americans combined. After all, he invented the personal computer, the graphical user interface, and the Internet."with
"We shouldn't think it unfair that Bill Gates has more wealth than the 100 million poorest Americans combined. After all, he invented the personal computer, the graphical user interface, and the Internet. <g>"
It would have been easy enough for me to make sure that such characters
never got interpreted as markup. In fact, with AOLserver one can do it
with a single call to the built-in procedure ns_quotehtml
.
However, consider the case where a nerd posts some HTML. Other users
would then see
"For more examples of my brilliant thinking and modesty, check out <a href="http://philip.greenspun.com/">my home page</a>."I discovered that the only real solution is to ask the user whether the submission is an HTML fragment or plain text, show the user an approval page where the content may be previewed, and then remember what the user told us in an
html_p
column in the database.
Is this data model perfect? Permanent? Absolutely. It will last for at least... Whoa! Wait a minute. I didn't know that Dave Clark was replacing his original Internet Protocol, which the world has been running since around 1980, with IPv6 (http://www.faqs.org/rfcs/rfc2460.html). In the near future, we'll have IP addresses that are 128 bits long. That's 16 bytes, each of which takes two hex characters to represent. So we need 32 characters plus at least 7 more for periods that separate the hex digits. We might also need a couple of characters in front to say "this is a hex representation". Thus our brand new data model in fact has a crippling deficiency. How easy is it to fix? In Oracle:
You won't always get off this easy. Oracle won't let you shrink a column from a maximum of 50 characters to 16, even if no row has a value longer than 16 characters. Oracle also makes it tough to add a column that is constrained
alter table bboard modify (originating_ip varchar(50));
not null
.