Forum OpenACS Q&A: Response to Data migration

Posted by Andrew Grumet on
In addition to everything above, we've also contemplated an intermediate XML representation for purposes of content syndication and reuse (cf http://www.imsproject.org/). After talking to Michael B. about this and also given our time constraints, this approach is going to have to wait.

On the other hand, it seems like there could be significant value in separating out the process of getting data out of the old system from putting data into the new system. I've been pondering this a bit and think the following design pattern could be useful later without adding much overhead to our task:

  1. Create a separate database pool for the ACS 3.x system, named "acs3x".

  2. Keep this pool out of the db API's hands by listing only the regular pools in our config file:
    ns_section ns/server/${server}/acs/database
    ns_param AvailablePool pool1
    ns_param AvailablePool pool2
    ns_param AvailablePool pool3
    
    
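For step 1, the pool itself is defined in the usual AOLserver way. A minimal sketch follows; the driver name, datasource, and credentials are placeholders for whatever your ACS 3.x database actually uses:

    ns_section ns/db/pools
    ns_param acs3x "ACS 3.x migration pool"
    
    ns_section ns/db/pool/acs3x
    # driver/datasource/user/password below are illustrative only
    ns_param driver postgres
    ns_param datasource localhost::acs3x_db
    ns_param user nsadmin
    ns_param password secret
    ns_param connections 2

Because "acs3x" is never listed as an AvailablePool in step 2, the db API leaves it alone and only the migration scripts touch it.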
  3. Push any content-fetching code out into a proc that returns a database cursor...
    
    set db [ns_db gethandle acs3x]
    
    # should return a cursor containing the following columns:
    #    email, first_names, last_name, screen_name, password,
    #    member_state
    
    set selection [migration_user_data $db]
    
    while { [ns_db getrow $db $selection] } {
        set_variables_after_query
        # do stuff with the data
    }
    
    ns_db flush $db
    
    # hand the connection back to the acs3x pool when the script is done
    ns_db releasehandle $db
    
    
  4. Content-fetching procs look like this:
    proc migration_user_data db {
    
        set sql "select ... from ... where ..."
    
        set selection [ns_db select $db $sql]
        return $selection
    
    }
    
  5. Migration teams customize the content-fetching procs only.
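For example, a team whose 3.x users table diverged from the stock schema would edit only the SQL inside the proc, aliasing their columns to the names the import loop expects. The table and column names below are hypothetical; adjust them to your own schema:

    proc migration_user_data db {
    
        # Alias legacy columns to the column names the importer
        # expects (email, first_names, last_name, screen_name,
        # password, member_state).
        set sql "select email,
                        first_names,
                        last_name,
                        screen_name,
                        password,
                        'approved' as member_state
                 from   users
                 where  user_state = 'authorized'"
    
        return [ns_db select $db $sql]
    }

The import loop in step 3 never changes; all site-specific knowledge stays inside these procs.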
This is perhaps not quite as pretty as the .NET "DataSource" abstraction, because it ties you to the ns_db API when your underlying data might actually be XML. On the other hand, it uses native AOLserver API calls (== no overhead to us) and saves you from having to ram everything into memory before doing the INSERTs.

We've actually come quite a way since my last post, and have draft scripts to import: users, dotlrn-users, communities, departments, terms, subjects, classes, subgroups, faqs, news. We're in the process of refining these scripts, adding new ones, and starting to tune various queries now that we have about 70,000 acs_objects in the system.

We're happy to work with anyone who wants to help, and to hand out copies of our buggy unfinished scripts ;) This being somewhat different from designing a new toolkit, I suspect folks won't feel too locked out if we wait until we're finished (late June at the latest) to release. But speak up if you'd like this effort to be more open.