Assessment functional requirements

Introduction

The assessment module provides OpenACS with capabilities to conduct surveys, tests and dynamic information gathering in general, as can be seen in the use cases.

Vision Statement

The motivation behind the Assessment package is to extend the functionality of the existing Survey package in both depth and breadth:

more question formats, user response filtering and processing, versioning, import/export capabilities for standards-based exchange with non-OpenACS systems, etc.
mechanisms to embed Assessment capabilities within other OpenACS packages and to assemble larger systems of packages within which Assessment is one component (eg dotLRN, clinical trials management systems, etc)

The current Survey package is a very capable piece of engineering that provides stand-alone data collection functions. It is subsite-aware and has been integrated to some extent with portlets. It also is just being integrated into user registration processes. These efforts point the path down which the Assessment package intends to proceed to its logical conclusion.

Development efforts for Assessment thus involve two tracks:

refinement and extension of the data model and UIs from Survey (and its sibling forks) to support a variety of expanded user requirements
incorporation of hooks (of various sorts, such as Service Contracts) to integrate Assessment with OpenACS subsystems: Content Repository, Workflow, Notifications, Internationalization, etc

The measure of success of the Assessment package is the ease with which it can rapidly be deployed into some high-profile implementations, notably dotLRN and a clinical trials management system under development.

Use Cases

The assessment module in it's simplest form is a dynamic information gathering tool. This can be clearly seen in the first group of use cases, which deal with surveys (one form of assessment, e.g. for quality assurance or clinical trials). An extension of this information gathering the possibility to conduct an evaluation on the information given, as we show in the second group of use cases (testing scenarios). Last but not least, the assessment tool should be able to provide it's information gathering features to other packages within the OpenACS framework as well.

It is very important to note that not all parameters and features mentioned in this use case should be displayed to the user at all times. Depending on the use case, a good guess with pre determined parameters should be made for the user (e.g. no need to let the user fill out correct answers to questions, if the question is not used in a test). Some use cases like elections require special parameters not necessary anywhere else (like counting system).

Survey scenario

The survey scenarios are the basic use cases for the use of the assessment system.

Simple survey

An editor wants to conduct surveys on his site. For this purpose he creates questions which are stored in a question catalogue. From this question catalogue, the editor choose the questions he wants to use in his current survey along with the style the survey should be presented to the user. Once satisfied he can make the survey public or test it first. Once the survey is public subjects (users) of the site can take the survey by filling out the generated form with all the questions the author added to the survey.

Quality Assurance

A company wants to get feedback from users about it's product. It creates a survey which offers branching (to prevent users from filling out unnecessary data, e.g. if you answered you have never been to Europe the question "Have you seen Rome" should not show up) and multi-dimensional likert scales (To ask for the quality and importance of a part of the product in conjunction).

Professional data entry

A clinic wants to conduct a trial. For this research assistants are asked to interview the patients and store the answers in the assessment on behalf of the client. For meeting FDA requirements it is mandatory to prove exactly who created any datum, when, whether it is a correct value, whether anyone has looked at it or edited it and when along with other audit trails. As mistakes might happen, it is important that the system runs checks on the plausibility of the entered data and the validity of it (area code should be five digits, if the age of the patient is below 10, no need to ask for credit card information, ...).

University survey

A Professor wants to create a test by searching through the question database and selecting old questions. He searches the database for a specific keyword or browses by category. The System presents him all questions which have the keyword and/or category in it. The Professor is able to preview every question and may then decide which question he will transfer into the survey.

Internal Evaluation

An institution wants to survey students to compare the quality of specific courses, teachers, or other factors effecting the quality of their education and level of happiness. It should be possible for the person who takes the survey to submit the survey anonymously and only be able to take the survey once.

It should also be able to show the results of a survey to a group of users (e.g. a specific department evaluated). The results should be able to be displayed in a way that give a department a ranking compared with other departments.

Reuse of questions

The author of multiple choice question decides that the provided answers are not good for differentiating the knowledge of the subjects and changes some of them. All editors using this question should be informed and asked, if they want to use the changed version or the original one. If the decision is made to switch, it has to be guaranteed that a distinction between subjects that answered the original and the new version is kept. In addition the editor should be able to inform all subjects that have taken the question already, that it has changed (and that they might (have to) re-answer).

Multiple languages

The quality assurance team of the company mentioned above realizes that the majority of it's user base is not native English speakers. This is why they want to add additional translations to the questions to broaden the response base. For consistency, the assessment may only be shown to the subject if all questions used have been translated. Furthermore, it is necessary to store the language used along with the response (as a translation might not be as good as the original).

The poll

An editor wants to conduct a poll on the site with immediate publication of the result to get a feeling how users like the new design of the website. The result can be displayed in an includelet (see the below for details) on any page the editor wants.

The election

The OpenACS community wants to conduct a new election on the OCT. On creation the names of the contestants have to be available along with a list of all users allowed to vote. Depending on the election system, the users have one or multiple votes (ranked or not), which are calculated in a certain way. Once the election is over the result is published.

Collective Meeting planing

The sailing club needs to find meeting time for all skippers to attend. Given a number of predefined choices, each skipper can give his/her preference for the time slots. The slot with the highest approval wins and is automatically entered into the calendar of all skippers and a notification send out.

Testing scenario

Especially in the university environment it is important to be able to conduct tests. These help the students to prepare for exams but also allow Professors to conduct exams. In addition to the data collection done in a survey scenario testing adds checks and instant evaluation to assessment.

Proctored Exam

A Professor wants to have a proctored test in a computer room. He wants to create the test using question that he has added and are already in the database. The only people allowed to take the test are the people that have actually showed up in the room (e.g. restricting the exam to specific IP-subnet and/or an exam password which he will give the students in the room at the time of the test that gives them access to the exam). Additional security measures include:

Students have to submit the survey signed with their PGP key (which has been verified by the university) at the end.
Students have to print out their test and sign every page to make sure the answers in the system are identical to the ones the student has given.
In a purely multiple choice environment, the Test might be printed out on a sheet of paper for each user along with a return sheet which needs the answers to be ticked off. A scanner system scans this return sheet and stores the data for the student in the system.

The Mistake

A Professor has created a test from the question pool and have administered the exam to a group of students. The test has been taken by some of his students already. He discovers that the answer to one of the questions is not correct. He modifies the test and should be given the option to change the results of exams that have already been completed and the option to notify students who have taken the test and received a grade that their results have changed.

Discriminatory power

A Professor has created a test which is taken by all of his students. The test results should be matched with the individual results to create the discriminatory power and the reliability of the questions used in the test. The results should be stored in the question database and be accessible by every other professor which has the privileges to access the database of this professor.

[A Question improves the test in reliability if it differentiates in the context of the test. This is happening if it has discriminatory power. The Question has discriminatory power if it is splitting good from bad students within the question in the same way they passes the test as good and bad students. The discriminatory power tells the professor if the question matches the test. Example: A hard question with a high mean value should be answered by good students more often right than by bad students. If the questions is answered same often by good and bad students the discriminatory power tells the professor that the question is more to guess than to know]

The vocabulary test

A student wants to learn a new language. While attending the class, he enters the vocabulary for each section into the assessment system. If he wants to check his learned knowledge he takes the vocabulary test which will show him randomized words to be translated. Each word will have a ranking stating how probable it is for the word to show up in the test. With each correct answer the ranking goes down, with each wrong answer it goes up. Once a section has been finished and all words have been translated correctly, the student may proceed to the next section. Possible types of questions:

Free text translation of a word
Free text translation of a sentence
Multiple choice test
Fill in the blanks

To determine the correct answer it is possible to do a char-by-char compare and highlight the wrong parts vs. just displaying the wrong and correct answer (at the end of the test or once the answer is given).

The quizz

To pep up your website you offer a quiz, which allows users to answer some (multiple choice) questions and get the result immediately as a percentage score in a table comparing that score to other users. Users should be able to answer only a part of the possible questions each time. If the user is in the top 2%, offer him the contact address of "Mensa", other percentages should give encouraging text.

Scoring

The computer science department has a final exam for the students. The exam consists of 3 sections. The exam is passed, if the student achieves at least 50% total score. In addition the student has to achieve at least 40% in each of the sections. The first section is deemed more important, therefore, it gets a weight of 40%, the other two sections only 30% towards the total score. Each section consists of multiple questions that have a different weight (in percent) for the total score of the section. The sum of the weights has to be 100%, otherwise the author of the section gets a warning. Some of the questions are multiple choice questions, that get different percentages for each answer. As the computer science department wants to discourage students from giving wrong answers, some wrong answers have a negative percentage (thereby reducing the total score in the section).

Reuse in other packages

The information gathering capabilities of the assessment system should be able to be reused by other packages.

User profiling

In order to join a class at the university the student has to fill out some questions. The answers can be viewed by the administrator but also by other students (pending the choice of the user). This latter functionality should not be part of assessment itself, but of a different module, making use of assessment. The GPI user-register is a good example for this.

Includes

Using a CMS the editor wants to include the poll on the first page on the top right corner. The result should be shown on a separate page or be included in the CMS as well.

Information gathering for developers

A developer needs functionality for gathering dynamic information easily. For this he should be able to easily include an assessment instead of using ad_form directly in his code. This gives the administrator of the site the option to change the questions at a later stage (take the questions in the user sign-up process as an example).

Database questions

Some answers to questions should be stored directly in database tables of OpenACS in addition to the assessment system. This is e.g. useful if your questions ask for first_names and last_name. When answering the question, the user should see the value currently stored in the database as a default.

Action driven questions

The company conducting the QA wants to get more participants to it's survey by recommendation. For this each respondee is asked at the end of the survey if he would recommend this survey to other users (with the option to give the email address of these users). The answer will be processed and an email send out to all given emails inviting them to take the survey.

User Types

There are several types of administrative users and end-users for the Assessment package which drive the functional requirements. Here is a brief synopsis of their responsibilities in this package.

Package-level Administrator

Assigns permissions to other users for administrative roles.

Editor

Has permissions to create, edit, delete and organize in repositories Assessments, Sections and Items. This includes defining Item formats, configuring data validation and data integrity checks, configuring scoring mechanisms, defining sequencing/navigation parameters, etc.

Editors could thus be teachers in schools, principal investigators or biostatisticians in clinical trials, creative designers in advertising firms -- or OpenACS developers incorporating a bit of data collection machinery into another package.

Scheduler

Has permissions to assign, schedule or otherwise map a given Assessment or set of Assessments to a specific set of subjects, students or other data entry personnel. These actions potentially will involve interfacing with other Workflow management tools (e.g. an "Enrollment" package that would handle creation of new Parties (aka clinical trial subjects) in the database.

Schedulers could also be teachers, curriculum designers, site coordinators in clinical trials, etc.

Analyst

Has permissions to search, sort, review and download data collected via Assessments.

Analysts could be teachers, principals, principal investigators, biostatisticians, auditors, etc.

Subject

Has permissions to complete an Assessment providing her own responses or information. This would be a Student, for instance, completing a test in an educational setting, or a Patient completing a health-related quality-of-life instrument to track her health status. Subjects need appropriate UIs depending on Item formats and technological prowess of the Subject -- kiosk "one-question-at-a-time" formats, for example. May or may not get immediate feedback about data submitted.

Subjects could be students, consumers, or patients.

Data Entry Staff

Has permissions to create, edit and delete data for or about the "real" Subject. Needs UIs to speed the actions of this trained individual and support "save and resume" operations. Data entry procedures used by Staff must capture the identity if both the "real" subject and the Staff person entering the data -- for audit trails and other data security and authentication functions. Data entry staff need robust data validation and integrity checks with optional, immediate data verification steps and electronic signatures at final submission. (Many of the tight-sphinctered requirements for FDA submissions center around mechanisms encountered here: to prove exactly who created any datum, when, whether it is a correct value, whether anyone has looked at it or edited it and when, etc etc...)

Staff could be site coordinators in clinical trials, ensurance adjustors, accountants, tax preparation staff, etc.

System / Application Overview

Editing of Assessments

Manage the structure of Assessments -- the organization of series of questions (called "Items") into Sections (defined logically in terms of branch points and literally in terms of "Items presented together on a page"), along with all other parameters that define the nature and function of all Assessment components.
Create, edit and delete Assessments, the highest level in the structure hierarchy. Configure Assessment attributes:
- Assessment name, description, version notes, instructions, effective dates (start,stop), deployment status (development, testing, deployed, ended), whether it can be shared or cloned, associated logo, etc.
- The composition of an Assessment consisting of one or more Sections, or even other pre-made Assessments.
- The criteria that determine when a given Assessment is complete, derived from completion criteria rolled up from each constituent Section.
- Navigation criteria among Sections -- including default paths, randomized paths, rule-based branching paths responding to user-submitted data, and possibly looping paths.
- Whether the Assessment metadata (structure, composition, sequencing rules etc) can be altered after data collection has begun (scored Assessments may not make any sense if changed midway through use).
- Other measured parameters of how an Assessment gets performed -- total elapsed time, time per Section, time per Item
- Configuration of state transitions of an Assessment, depending on context of its deployment. For instance:
  - In education, an Assessment might be: Unbegun, Partially Begun, Submitted, Revised, Finally Submitted, Auto-graded, Final Manually Graded, Reviewed by Student, Reviewed by Student and Teacher Together
  - In clinical trials, the process is complex and dependent on whether "double entry" is needed; see this FSM diagram for an illustration.
- Scheduling: number of times user can perform Assessment; whether user can revise a completed Assessment; whether a user can interrupt and resume a given Assessment
- Control of access permissions for all components of the Assessment, including editing of the Assessment itself, access to collected Assessment data, and control of scheduling procedures.
- A "clear" button to wipe all user input from an Assessment.
- A "printer-friendly" version of the Assessment so that it can be printed out for contexts in which users need to complete it on paper and then staff people transcribe the answers into the web system (yes, this actually is an important feature).
Create, edit, clone and delete Sections -- the atomic grouping unit for Items. Configure Section attributes:
- Section names, descriptions, prompts (textual and graphical information), etc.
- The composition of Items in a Section.
- The formatting of Items in a Section -- vertical or horizontal orientation, grid patterns, column layouts, etc.
- The criteria that determine when a given Section is complete, derived from submitted data rolled up from the constituent Items.
- Item data integrity checks: rules for checking for expected relationships among data submitted from two or more Items. These define what are consistent and acceptable responses (ie if Item A is "zero" then Item B must be "zero" as well for example).
- Navigation criteria among Items within a Section -- including default paths, randomized paths, rule-based branching paths responding to user-submitted data, including possibly looping paths.
- Any time-based attributes (max time allowed for Section, minimum time allowed)
- A "clear" button to clear all user values in a Section.
Create, edit, clone and delete Items -- the individual "questions" themselves. Configure Item attributes:
- Item data types: integer, numeric, text, boolean, date, or uploaded file
- Item formats: radio buttons, checkboxes, textfields, textareas, selects, file boxes.
- Item values: the label, instructions, feedback text (for use during "grading") etc displayed with the Item either during the subject's performance of the Assessment or the.
- Item designation (a "field code") to include in data reporting
- Item defaults: configure a radio button choice that will be checked when the Assessment first displays, a text that will appear, a date that will be set, etc.
- Item data validation checks: correct data type; range checks for integer and numeric types; regexp matching for text types (eg accept only valid phone numbers) along with optional case-sensitivity during text validation; valid file formats for uploaded files. Note: the designation of "the correct answer" in the educational context of testing is a special case of data validation checks.
  Note also: need to support three-value logic regarding the existence of any single Item datum: null value means the Item hasn't been dealt with by responder; "unknown" value means that the Item has been answered but the responder doesn't know value; actual value (of proper type) means that the responder has found and submitted a value.
- Database-derived stock Items (eg, "country widgets", "state widgets", etc).
- Item-specific feedback: configurable text/sound/image that can be returned to user based on user response to Item.
- Any time-based attributes (max time allowed for Item, minimum time allowed).
- Support of combo-box "other" choice in multiple-choice Items (ie, if user selects a radiobutton or checkbox option of "other" then the textbox for typed entry gets read; if user doesn't select that choice, then the textbox is ignored).
- A "clear Item" button for each Item type that can't be directly edited by user.
Create, edit, clone and delete Item Choices -- the "multiple choices" for radiobutton and checkbox type Items:
- Choice data types: integer, numeric, text, boolean
- Choice formats: horizontal, vertical, grid
- Choice values: labels, instructions, numeric/text encoded values
- Choice-specific feedback: configurable text/sound/image that can be returned to user based on user response. -- either while subject is taking Assessment or later when subject is reviewing the "graded" Assessment.
Create, edit, clone and delete post-submission Assessment Processing Procedures. Configure:
- Scoring Algorithms: names and arithmetic calculation formulae to operate on submitted data when the form returns to the server. These include standard "percent correct -> letter grade" grading schemes as well as formal algorithms like Likert scoring (conversion of ordinal responses to 0-100 scale scores).
- Names and descriptions of Scales -- the output of Algorithm calculations.
- Mapping of Items (and/or other Scales) to calculate a given Scale Scores.
- Define data retrieval and display alternatives: tabular display in web page tables; tab-delimited (or CSV etc) formats; graphical displays (when appropriate).
- Note: manual "grading by the teacher" is a special case of post-submission Assessment Processing in that no automated processing occurs at all; rather, an admin user (the teacher) retrieves the subject's responses and interacts with the subject's data by in effect annotating it ("This answer is wrong" "You are half right here" etc). Such annotations could be via free text or via choices configured during editing of Items and Choices (as described above).
Note that there are at least three semantically distinct concepts of scoring, each of which the Assessment package should support and have varying levels of importance in different contexts. Consider:
- Questions may have a "correct" answer against which a subject's response should be compared, yielding some measure of a "score" for that question varying from completely "wrong" to completely "correct". The package should allow Editors to specify the nature of the scoring continuum for the question, whether it's a percentage scale ("Your response is 62% correct") or a nominal scale ("Your response is Spot-on" "Close but No Cigar" "How did you get into this class??")
- Raw responses to questions may be arithmetically compiled into some form of Scale, which is the real output of the Assessment. This is the case in the health-related quality-of-life measures demo'd here. There is no "correct" answer as such for any subject's responses, but all responses are combined and normalized into a 0-100 scale.
- Scoring may involve summary statistics over multiple responses (one subjects' over time; many subjects' at a single time; etc). Such "scoring" output from the Assessment package pertains to either of the two above notions. This is particularly important in educational settings.
Create, edit, clone and delete Repositories of Assessments, Sections and Items. Configure:
- Whether a Repository is shareable, and how/with whom.
- Whether a Repository is cloneable, and how/with whom.
- Note: this is the concept of a "Question Catalog" taken to its logical end -- catalogs of all the organizational components in an Assessment. In essence, the Assessment package is an Assessment Catalog. (The CR is our friend here ;-)
- Versioning is a central feature of this repository; multiple "live" versions of any entity should be supported, with attributes (name, version notes, version creation dates, version author, scope -- eg subsite/group/etc) to make it possible to identify, track and select which version of any entity an Assessment editor wants to use.

Scheduling of Assessments

Create, edit, clone and delete Assessment Schedules. Schedulers will define:
- Start and End Dates for an Assessment
- Number of times a Subject can perform the Assessment (1-n)
- Interval between Assessment completion if Subject can perform it more than once
- Whether anonymous Subjects are allowed
- Text of email to Subjects to Invite, Remind and Thank them for performing Assessment
- Text of email to Staff to Instruct, Remind and Thank them for performing Assessment on a Subject
Provide these additional functions:
- Support optional "electronic signatures" consisting simply of an additional password field on the form along with an "I attest this is my response" checkbox that the user completes on submission (rejected without the correct password) -- ie authentication only.
- Support optional "digital signatures" consisting of a hash of the user's submitted data, encrypted along with the user's password -- ie authentication + nonrepudiation.
- Perform daily scheduled procedures to look for Subjects and Staff who need to be Invited/Instructed or Reminded to participate.
- Incorporate procedures to send Thanks notifications upon completion of Assessment
- Provide UIs for Subjects and for Staff to show the status of the Assessments they're scheduled to perform -- eg a table that shows expected dates, actual completion dates, etc.

Analysis of Assessments

Provide UIs to:
- Define time-based, sortable searches of Assessment data (both primary/raw data and calculated Scored data) for tabular and (if appropriate) graphical display
- Define time-based, sortable searches of Assessment data for conversion into configurable file formats for download
- Define specific searches for display of data quality (incomplete assessments, audit trails of changed data values, etc)

Performance of Assessments

Provide mechanisms to:
- Handle user Login (for non-anonymous studies)
- Determine and display correct UI for type of user (eg kiosk format for patients; keyboard-centric UI for data entry Staff)
- Deliver Section forms to user
- Perform data validation and data integrity checks on form submission, and return any errors flagged within form
- Display confirmation page showing submitted data (if appropriate) along with "Edit this again" or "Yes, Save Data" buttons
- Display additional "electronic signature" field for password and "I certify these data" checkbox if indicated for Assessment
- Process sequence navigation rules based on submitted data and deliver next Section or terminate event as indicated
- Track elapsed time user spends on Assessment tasks -- answering a given question, a section of questions, or the entire Assessment -- and do something with this (we're not entirely sure yet what this should be -- merely record the elapsed time for subsequent analysis, reject over-time submissions, or even forcibly refresh a laggard user's page to "grab the Assessment back")
- Insert appropriate audit records for each data submission, if indicated for Assessment
- Handle indicated email notifications at end of Assessment (to Subject, Staff, Scheduler, or Editor)