Bulk Upload
Status
pre-release (part of the ecommerce-g2 project)
Introduction
There is little consistency between packages for uploading data, and most packages do not offer an import feature. The current practice is to use the database's import features directly. This usually means writing small scripts or using intermediate applications to manipulate the data before import, and using shell commands to get the data into the right security context and location. Mistakes in an import usually have to be corrected with hand-written SQL updates. A person importing data therefore likely needs admin-level access and Unix-level competency, which is beyond the usual skill set required to manage a website application. A bulk-upload package would standardize imports and lower the barriers to setting up and managing OpenACS packages.
Vision Statement
A package that provides a standardized UI and import services for bulk uploading data (tables and lists) into the existing tables of other packages.
Requirements
- provide import modes
  - insert rows with new keys only
  - update where rows already exist
  - report any unchanged rows?
- ignore or error if importing a field not found in tables or specified sets
- import to multiple tables from the same upload
- option to print or log ignored or rejected rows/fields
- use the status_bar feature to show when a file has been uploaded and when it is being processed (helps in cases where the connection times out)
- use this framework to create a UI for managing ACS reference data (acs-reference): importing, updating, editing, reporting (for ecommerce-g2)
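The import modes and reporting requirements above can be sketched as follows. This is only an illustration in Python with SQLite, not the package's design: the function name `bulk_import`, its `mode` and `unknown` parameters, and the shape of the report are all assumptions made for the example. Values are compared as text, so it suits text columns best.

```python
import csv
import io
import sqlite3


def bulk_import(conn, table, key, csv_text, mode="upsert", unknown="ignore"):
    """Import CSV rows into `table`, matching existing rows on column `key`.

    mode:    "insert" (new keys only), "update" (existing keys only), "upsert" (both)
    unknown: "ignore" drops fields not found in the table, "error" rejects the row
    Returns a report with counts and a list of (line number, reason) rejections.
    """
    # Hypothetical helper: table/column names are assumed to come from trusted
    # package configuration, not from the uploaded file.
    columns = [info[1] for info in conn.execute(f"PRAGMA table_info({table})")]
    report = {"inserted": 0, "updated": 0, "unchanged": 0, "skipped": 0, "rejected": []}

    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        extra = [c for c in row if c not in columns]
        if extra and unknown == "error":
            report["rejected"].append((line_no, f"unknown fields: {extra}"))
            continue
        data = {c: v for c, v in row.items() if c in columns}
        if key not in data:
            report["rejected"].append((line_no, f"missing key column: {key}"))
            continue

        existing = conn.execute(
            f"SELECT {', '.join(data)} FROM {table} WHERE {key} = ?", (data[key],)
        ).fetchone()

        if existing is None:
            if mode == "update":
                report["skipped"] += 1  # update mode never creates rows
                continue
            marks = ", ".join("?" for _ in data)
            conn.execute(
                f"INSERT INTO {table} ({', '.join(data)}) VALUES ({marks})",
                list(data.values()),
            )
            report["inserted"] += 1
        elif mode == "insert":
            report["skipped"] += 1  # insert mode never touches existing rows
        elif tuple(str(v) for v in existing) == tuple(data.values()):
            report["unchanged"] += 1
        else:
            fields = [c for c in data if c != key]
            assignments = ", ".join(f"{c} = ?" for c in fields)
            conn.execute(
                f"UPDATE {table} SET {assignments} WHERE {key} = ?",
                [data[c] for c in fields] + [data[key]],
            )
            report["updated"] += 1
    return report
```

The returned report could be printed or logged to satisfy the requirement about ignored or rejected rows. Importing to multiple tables from one upload would amount to calling such a function once per target table with the same file.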
Implementation Notes
See also: https://web.archive.org/web/20071007211200/http://jongriffin.com/articles/openacs-relevancy/
The existing ecommerce bulk upload of custom fields is the most flexible implementation at this point, because it handles both inserts and updates, builds the SQL itself, and also uses the xql query files: https://raw.githubusercontent.com/openacs/ecommerce/master/www/admin/products/extras-upload-2.tcl
Perhaps a simplified spreadsheet data model would work well, allowing for tables of almost any number of columns or rows without having to create or expose code that modifies db tables.
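One way to picture such a spreadsheet-style model is a single generic table that stores one record per cell, so that uploads of any shape fit without new DDL. The sketch below uses Python with SQLite purely for illustration. The table name `sheet_cells`, its columns, and the helper functions are hypothetical and are not taken from any existing OpenACS package.

```python
import csv
import io
import sqlite3


def create_store(conn):
    """Create the generic cell table (hypothetical schema, one row per cell)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sheet_cells (
               sheet_id TEXT    NOT NULL,
               row_no   INTEGER NOT NULL,
               col_name TEXT    NOT NULL,
               value    TEXT,
               PRIMARY KEY (sheet_id, row_no, col_name)
           )"""
    )


def load_sheet(conn, sheet_id, csv_text):
    """Store every cell of a CSV upload under `sheet_id`; return the row count."""
    count = 0
    for row_no, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=1):
        for col_name, value in row.items():
            if col_name is None:  # surplus cells beyond the header are dropped
                continue
            conn.execute(
                "INSERT OR REPLACE INTO sheet_cells VALUES (?, ?, ?, ?)",
                (sheet_id, row_no, col_name, value),
            )
        count += 1
    return count


def read_sheet(conn, sheet_id):
    """Rebuild the sheet as a list of dicts, one per row, in row order."""
    rows = {}
    query = (
        "SELECT row_no, col_name, value FROM sheet_cells "
        "WHERE sheet_id = ? ORDER BY row_no, col_name"
    )
    for row_no, col_name, value in conn.execute(query, (sheet_id,)):
        rows.setdefault(row_no, {})[col_name] = value
    return [rows[n] for n in sorted(rows)]
```

Because every value is stored as text in one table, two uploads with entirely different columns can coexist. The trade-off is that typing, constraints, and efficient queries over the data are left to whatever consumes the sheet.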
DAVEB: In addition to any generic storage of imported data, I think it would be useful to support importing into specific storage tables. The easiest, best examples are the content repository, or the tables and views defined by dynamic types. There are Tcl APIs to insert into and update these tables, and it makes sense to support this as well.
Don Baccus has other implementations which complement this and may provide an overall development path for a preliminary general bulk upload.
Feature requests