Forum OpenACS Q&A: Response to Survey Package Expansion Proposal

Matthew, congratulations on an excellent start! I'd like to add some comments directed at the Big Picture, not at specifics of data modeling or implementation.

In addition to the two applications (graded tests for education; health surveys), there is a third large application (to which I'm directing much of my effort currently) -- generic data capture tools for clinical trials or other domains where the "survey" is intended to guide users through potentially complex data entry steps and generate "clean" data.

I'm not sure whether this third application is qualitatively different or just quantitatively different from the first. While there are "correct" answers in the education setting, in clinical trials, data needs range checking (for numeric data) or string matching (for textual data); this is rather different than simply designating a specific response to be the correct one for the question.

In addition, in clinical trials, a given "case report form" (CRF) may be composed of one or more atomic surveys that may be reused in other CRFs. This isn't exactly the same as defining different "sections" but rather amounts to creating another containing object into which to add individual surveys.

Also, in clinical trials it is essential that the user be able to annotate every response to every question if they choose. Furthermore, any change to the response to any question needs to be saved in a user id/timestamped audit record.

Another common situation is the need to add one or more responses to a question like this: "Enter the following for all hospitalizations since some date: date of admission, primary diagnosis, discharge hematocrit, etc" IE, instead of a question with a single response to be made, what is required is a (usually) dated collection of data needing a UI that allows the user to enter one or more collections of these data. This is similar to entering a line item in an invoice and needs similiar UIs and data modeling.

Next, "skip logic" is a key requirement that probably educational applications don't face (well, traditional ones anyway; all "adaptive testing" like the current SATs etc use it). This is complex since typically the entire questionnaire is already presented to the user, so something like Javascript would be needed to navigate the user through the survey. Otherwise the survey would have to be presented to the user one question at a time. However you implement it, there must be provision for determining which question should be delivered in response to any given user response.

There are probably other important requirements that aren't coming to mind immediately, but you get the drift. The question is: given the differences in these three applications, does it make sense to try to meet all the requirements in a single package or instead create three (er, two, since I agree that the first two function very similarly)? To add a bunch of machinery and columns for range checking and auditing would complexify the educational app, while failing to have that in the survey package would make it useless for the more complex applications.

Some other requirements that probably apply to all these settings is to have robust scheduling mechanisms, such that an admin can specify how frequently (just once, twice, n times) and on what interval (daily, monthly, yearly, etc) a user can respond to the survey. Also, there should be optional mechanisms that email invitations/assignments to the user, reminders when they haven't done the survey as expected, email thanks when they have, and email copies to the admin/teacher/researcher for each of these events.

There also are often two types of users -- "true" responders who are entering their own data/responses, and surrogate responders who are transcribing data into the system from paper forms (yes, this is far, far more common than you'd think). For the latter, more telegraphic, keyboard entry UIs are far preferable, while for the former, point&click UIs work better.

Finally, there are numerous scoring algorithms that should to be accommodated; there certainly are in the health survey setting. There needs to be a way to map one or more questions to a given scale and set how the responses to those questions are to be calculated into the scale's score. This might involve simple addition, multiplication, or may involve normalization into a 0-100 range (Likert scales). Some scales are calculated from other scales (means). Other scales are simply table look-ups (each response is mapped to an arbitrary value that needs to be specified, and a scale score involves adding/multiplying/finding the mean/etc of all these values) -- the SF12 is like this, eg.

Also complicating health surveys (and presumably other non-health surveys that get scored) is how to handle missing responses. If all questions must be answered, then there's no problem -- you reject the response until it's complete. But lots of surveys permit users to omit responses, and then impute values (eg the mean of the responses the user *did* provide) but also specify minimum number of responses for a given score that must be provided or else the scale is not scored.

Anyway, these are some additional requirements to be considered as we expand the survey package.