OpenACS Development forum: A status report on OpenACS release management

For those who might be interested in the sausage-making, here's a summary of what I do to get releases out. I work from this document, which I update as I find errors in the process.
  1. Update Translations. This is cumbersome as it requires upgrading the translation server. Since OpenACS/.LRN upgrades are not 100% bulletproof, I always do it on a backup server. A full two-way synchronization of the translations on translate.openacs.org with the cvs translations takes several hours, because the code is quite slow. Because of these factors, I only update translations where there is a request pending, or when we do a bigger release. As you can see, in the 5.1.x series, .1, .2, and .5 had translations but .0, .3, and .4 didn't.

    Future: Make successful upgrade of this server a prerequisite for OpenACS and .LRN releases. Provide an automated upgrade server for it so that anyone whose code breaks the upgrade is alerted within a day and held responsible.

  2. Rebuild the Changelog. This is now quite easy. It always surprises me how many changes there are for core on the stable branch, and in fact I am concerned that we are doing too much functionality work on the stable release branch. We should be doing basically zero, because each bit of functionality, while innocuous in isolation, adds risk, and by the time we have 20 functional changes, it seems very likely that the dot release will break somebody's upgrade, somewhere. This is contrary to the purpose of a dot release, which is to be a safe security/bugfix release.

    Future: Once we have more frequent minor releases, crack down on new functionality on release branches. This is not practical at present because core code put into the trunk sits there for a year or more before release. This is almost as bad as Microsoft.
  3. Update Version Numbers. Jeff tried to automate this for me, but it's still easier to manually update the simple perl script and then manually correct anything it missed. It takes only a few minutes to do it this way. (A sketch of the kind of version bump this amounts to appears after this list.) This is also where I catch any DocBook errors that other people have introduced by checking in XML without trying to generate the HTML. These are easy to fix, so I would rather people continue to fix doc errors directly in the XML, even if they make DocBook errors. (Thanks in particular to Malte for doing this, and to a few other people whose identities I've forgotten.)
  4. Tag the files in CVS. Figuring out what and why and when to tag took a few years, but doing the tagging takes a few minutes.
  5. Make the tarball(s). Also simple and risk-free. (The tag-and-export sketch after this list covers this step and the previous one.)
  6. Test the new tarball(s). I now tend to do this earlier, at the end of Step 3. After I have updated the docs and version numbers, I install the site using pure defaults (i.e., as service0 and without changing the config files) and run automated testing. If it installs and passes the automated tests, it's done as far as I'm concerned. I no longer repeat this with the tarballs because I'm reasonably confident that I can now tag, retrieve, and tar files reliably. (A minimal smoke-test sketch also follows the list.)

    Future: We need a lot more automated tests, including web tests with tclwebtest or something else, before this cycle will provide any more quality control than a trivial sanity check.
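
A minimal sketch of the kind of version bump step 3 describes (not the actual perl script; it assumes the old version string appears literally in each packages/*/*.info file, and the paths are illustrative):

    # bump_versions.py: rough stand-in for the version-bump script in step 3.
    # It assumes the old version string appears literally in each package's
    # .info file; anything it misses still gets the manual pass mentioned above.
    import sys
    from pathlib import Path

    def bump(tree: Path, old: str, new: str) -> None:
        for info in tree.glob("packages/*/*.info"):
            text = info.read_text()
            updated = text.replace(old, new)
            if updated != text:
                info.write_text(updated)
                print(f"bumped {info}")

    if __name__ == "__main__":
        # e.g. python bump_versions.py /web/service0 5.1.4 5.1.5
        bump(Path(sys.argv[1]), sys.argv[2], sys.argv[3])

It deliberately does nothing about regenerating the DocBook HTML, which is where the errors mentioned in step 3 show up.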
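
Steps 4 and 5 reduce to a handful of cvs and tar commands. A rough wrapper, assuming the usual rtag/export workflow; the CVS root, module, branch, and tag names below are examples rather than the real ones:

    # make_release.py: sketch of the tag-and-tarball steps (4 and 5).
    # The CVS root, module, branch, and tag names are examples only.
    import subprocess

    CVSROOT = ":pserver:anonymous@cvs.example.org:/cvsroot"
    MODULE = "openacs-4"          # example module name
    BRANCH = "oacs-5-1"           # release branch being tagged
    TAG = "openacs-5-1-5-final"   # new release tag
    VERSION = "5.1.5"

    def run(*cmd: str) -> None:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    def release() -> None:
        # Step 4: lay the release tag on the tip of the release branch.
        run("cvs", "-d", CVSROOT, "rtag", "-r", BRANCH, TAG, MODULE)
        # Step 5: export a clean tree (no CVS/ dirs) at that tag and tar it up.
        export_dir = f"openacs-{VERSION}"
        run("cvs", "-d", CVSROOT, "export", "-r", TAG, "-d", export_dir, MODULE)
        run("tar", "czf", f"{export_dir}.tar.gz", export_dir)

    if __name__ == "__main__":
        release()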
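
And the "trivial sanity check" of step 6 can be approximated with a tiny HTTP probe until real tclwebtest suites exist. The base URL and page list here are placeholders for a default service0 install:

    # smoke_test.py: minimal post-install probe; nowhere near real web testing.
    # Base URL and page list are placeholders for a default service0 install.
    import sys
    import urllib.request

    BASE = "http://localhost:8000"
    PAGES = ["/", "/register/", "/doc/"]   # example pages

    def check(path: str) -> bool:
        try:
            with urllib.request.urlopen(BASE + path, timeout=30) as resp:
                code = resp.status
        except Exception as exc:
            print(f"FAIL {path}: {exc}")
            return False
        print(f"{'ok' if code == 200 else 'FAIL'} {path} ({code})")
        return code == 200

    if __name__ == "__main__":
        results = [check(page) for page in PAGES]
        sys.exit(0 if all(results) else 1)

Anything beyond "the server answers these pages with a 200" still needs tclwebtest or an equivalent driving real user flows.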

Posted by Nima Mazloumi
How difficult would it be to offer nightly builds of head and main branches?

Posted by Jeff Davis
If by nightly build you mean a nightly cvs checkout, then not hard; if you mean installing head and 5.x nightly as a test, then hard.

Posted by Nima Mazloumi
Jeff, what exactly makes the latter hard?

Posted by Jeff Davis
Nima, automated things running out of cvs break and eat up time to fix. See Joel's testing post, where he mentions:
One thing I learned from trying, semi-successfully, to maintain a test server at Collaboraid is that it is a daily job. Despite the automation, various things creep through in a daily automated rebuild that bring sites down for different reasons and sneak past any alarm detection you might have. Peter and Lars and I built a framework for distributed, automated checking, so that we can have a simple central dashboard even if some of the sites are hosted on different machines, and we need to get that working again. So this task is to manually check all test servers every day (over time this could be changed to alert-based checking), troubleshoot any problems, assign tasks to get problems fixed, and restore the test servers to functionality. Every day.
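
The per-server half of that dashboard could start out very small: a cron-driven script that updates the checkout, pokes the local site, and reports one status line to a central collector. A sketch, with every hostname, path, and URL invented for the example:

    # nightly_check.py: the per-server half of the "central dashboard" idea.
    # Update the checkout, poke the local site, report one status line.
    # Hostnames, paths, and the collector URL are all invented for the example.
    import socket
    import subprocess
    import time
    import urllib.parse
    import urllib.request

    CHECKOUT = "/web/service0"                    # example working copy
    LOCAL_SITE = "http://localhost:8000/"         # example test site
    DASHBOARD = "http://test.example.org/report"  # example central collector

    def status() -> str:
        try:
            subprocess.run(["cvs", "-q", "update", "-dP"], cwd=CHECKOUT,
                           check=True, timeout=3600)
        except Exception as exc:
            return f"cvs update failed: {exc}"
        try:
            with urllib.request.urlopen(LOCAL_SITE, timeout=60) as resp:
                return "ok" if resp.status == 200 else f"http {resp.status}"
        except Exception as exc:
            return f"site down: {exc}"

    if __name__ == "__main__":
        line = f"{socket.gethostname()} {time.strftime('%Y-%m-%d %H:%M')} {status()}"
        urllib.request.urlopen(DASHBOARD,
                               data=urllib.parse.urlencode({"status": line}).encode(),
                               timeout=60)
        print(line)

The part Jeff calls hard, actually rebuilding or upgrading the install every night and keeping it healthy, is exactly the part this does not attempt.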