Forum OpenACS Q&A: Re: How can I find the code that picks up packages from cvs?

Posted by Gustaf Neumann on
Jim,

The code that has to be reworked is the repository-building code, which is part of every OpenACS installation. It is typically deactivated at client sites and only active at openacs.org. The proc [1] runs as a scheduled procedure and is responsible for rebuilding the .apm files served from the openacs.org repository (used e.g. when some site runs an "update/install from repository"). The code is closely tied to the CVS guidelines (which CVS tags have to be provided for branches and releases), and these also need to be reworked if we want to switch to git.

-g

[1] http://openacs.org/api-doc/proc-view?proc=apm_build_repository&source_p=1

Posted by Jim Lynch on
Gustaf,

Thanks, found it right away. Once I start working on this code to produce a gittable version, I'll be putting it in a branch on my fork, and I'll announce on a new thread here where we can discuss it and look at it.

-Jim

Posted by Jim Lynch on
One more thing...

Which branch should I be working on?

I guess master? I'll start with that assumption.

-Jim

Posted by Gustaf Neumann on
CVS has HEAD and oacs-5-8 as its relevant branches. New features should go into HEAD; in git terminology, "HEAD" is "master". However, I see that the latest changes (of the last 5 months) have not been copied over to HEAD. The apm-admin procs were not changed lately, though, so both versions are fine.

Posted by Jim Lynch on
OK, to get the code (you get all of openacs-core), you do:

git clone https://github.com/jwlynch/openacs-core.git

and the branch for this is port-for-git-apm_build_repository

which is based on master

There is one commit, which comments out the repository builder proc so I can put it back a little at a time.

-Jim

What is the purpose of commenting out that code?

Posted by Jim Lynch on
OK, I screwed up my setup. If you cloned my repo before this date:

Wed Apr 8 00:08:38 UTC 2015

you should reclone, and the branch name has also changed, now it's called port-git-apm_build_repository

Sorry for any inconvenience this may cause.

-Jim

I don't get what the purpose of cloning a clone is, when it is the same as the original except for having a function commented out.

A useful contribution would be work on the preliminaries, such as: Currently the git structure is composed of 305 separate git repositories (if I counted correctly). In order to generate the .apm files for all packages in all release channels (oacs terminology), one has to clone all these repositories and bring them into an appropriate structure. This leads us to the "git packaging" question (modules and wrappers, branches, tagging).

Maybe one option is to build a wrapper git module, which contains e.g. the 78 better-maintained packages currently in the oacs-5-8 branch, perhaps using git submodules [1]. This could be useful for setting up a small website with common OpenACS packages. But is this any better for such sites than using acs-core + "install from repository"?

For installed sites, the 78 packages are just a partial help. We have e.g. on our learn@wu system 148 OpenACS packages, including most of the 78 packages above, but many of these are modified, and the rest are our own development. All of these packages have been in git since 2008, with their own history, branches, tags etc. We want to use feature branches etc. I am pretty sure other large installed sites are in a similar situation. Since we have to rethink packaging with git, we should come up with some kind of recommendation, guidelines, and development workflows.

What is the best way to set up the git repository structure for large installations? Are submodules the best way? Is this good enough performance-wise? What to do about cross-submodule commits and maintenance work (cross-submodule tagging, branching), etc.? Although submodules look, from some distance, perfectly suited for packages, they are inconvenient in many respects (see e.g. [2, 3]); alternatives are e.g. git subtree [4] or subrepo [5]. It would be useful to evaluate the options in some depth to come up with a well-founded recommendation.

As for apm generation: we need a way to get all packages from the 305 repos, which requires a consistent tagging / branching structure to offer the right packages and package upgrades to site admins using different OpenACS releases. Is it feasible to enforce this in a highly decentralized way?
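To make the "get all packages from the 305 repos" step a bit more concrete, here is a minimal sketch in Python. It is not existing OpenACS code: the function name, the repository URLs and the staging layout are assumptions for illustration, and it only works if every repository really provides the same branch (or tag) name, which is exactly the consistency question raised above. It only builds the `git clone` command lines; running them is left to the caller.

```python
from pathlib import PurePosixPath


def clone_commands(repos, branch, staging_root):
    """Return one `git clone` argument list per package.

    repos: mapping of package name -> repository URL (hypothetical values).
    branch: branch or tag name that every repository is assumed to provide.
    staging_root: directory under which packages/<name> will be created.
    """
    commands = []
    for name in sorted(repos):
        dest = PurePosixPath(staging_root) / "packages" / name
        # --depth 1 keeps the checkout small; only the tip is needed to
        # build an .apm file for the given channel.
        commands.append(
            ["git", "clone", "--depth", "1", "--branch", branch,
             repos[name], str(dest)]
        )
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)`. A repository that lacks the requested branch makes its clone fail, which would at least make inconsistent tagging visible early.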

The big question is what we want to achieve.
What are the important use-cases?
- first install of OpenACS? scm does not matter.
- site admins: install from repo should be sufficient
- sites developing/contributing own packages: the workflow based on the package manager is a first step
- users who want to adapt the code: it would be nice to switch from a "tar checkout" to a "git checkout" via a click in the package manager and to contribute code back also via package manager.
- sites maintaining modified versions of core packages: this happens and will happen, but from our own experience I would be happier if improvements went into the common source base more often than into local modifications
- regular openacs-development: this is my least concern.

There are probably many more aspects to this.

-g

[1] http://bast.fr/talks/git/submodules/
[2] http://somethingsinistral.net/blog/git-submodules-are-probably-not-the-answer/
[3] https://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/
[4] http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/
[5] https://github.com/ingydotnet/git-subrepo/

Posted by Jim Lynch on
Oops, had to rewrite history on my repo again; couldn't be helped. So if you cloned before

Wed Apr 8 19:30:04 UTC 2015

then you'll have to reclone, and new branch is git-apm_build_repository.

-Jim

Posted by Jim Lynch on
So, yes, we should come up with something new along these lines, and we should have something soon; Lars' code has been working stably for a long time.

One question you may have is whether, once this code is completed and applied, we should just stop and accept that. No, as there are new and good ideas in the mix.

Why did I comment out most of the proc? Because I'm just starting, and having something that "does something" is good for testing and debugging.

305 repos. One way to go would be to take the packages dir of openacs-core and split all the packages out, each into its own repo. That would make it more consistent. The openacs-core git repo would then have an empty packages dir; later, if you still want an openacs-core tarball, you can put it together by extracting each package from its repo, putting the package files into a tarball staging area with the openacs-core infrastructure (etc, packages and the others), and just tarring it up.
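The tarball staging step could look roughly like the following Python sketch. It is only an illustration under assumptions: the function name, the `openacs-core` top-level directory name and the directory arguments are made up here, and it expects the core skeleton and the individual packages to be checked out on disk already.

```python
import tarfile
from pathlib import Path


def _skip_git(tarinfo):
    """Leave git metadata out of the release tarball."""
    return None if ".git" in tarinfo.name.split("/") else tarinfo


def build_core_tarball(skeleton_dir, package_dirs, tarball_path,
                       top="openacs-core"):
    """Write a gzip tarball combining a core skeleton with packages.

    skeleton_dir: checkout of the core repo (with an empty packages dir).
    package_dirs: mapping of package name -> directory with that package.
    tarball_path: where to write the .tar.gz file.
    """
    with tarfile.open(str(tarball_path), "w:gz") as tar:
        # Core infrastructure first (etc, packages and the others).
        tar.add(str(skeleton_dir), arcname=top, filter=_skip_git)
        # Then each package under <top>/packages/<name>.
        for name in sorted(package_dirs):
            tar.add(str(package_dirs[name]),
                    arcname=f"{top}/packages/{name}", filter=_skip_git)
    return Path(tarball_path)
```

Writing directly into the archive with `arcname` avoids copying files into a separate staging directory; a real staging area would work just as well.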

Right now, I'm wondering what the channel names are, so I can feed them into package extraction and building, so they get their right version and channel.

Having heard there's no volunteer help for this (too bad, hope we can generate more), I thought I'd jump in and see what could be done, if only in the interim. My feeling was, we need something, and soon (if not yesterday).

Also, I'm not sure if git submodules is the best thing, maybe git subtree is a better choice. More later, and, I think I need to learn more about how to do preliminary planning, I tend to see something and start generating code right away.

More later

-Jim

I can give a quick introduction to how I structure my ]project-open[ code, which obviously includes OpenACS.

- There is the main repository which includes the directories of ACS_ROOT, but no packages.
- Packages are configured using a pkgs-list.txt file, which tells checkout-modules to get the packages as submodules
- I usually have a branch for the main repository for each client / installation. This way I can point to special commits in the packages, depending on the client.
- If I make changes to a package for a client, I usually create a quick branch for that feature in the package and point to it from the client's main repository branch. Once fully tested and in production, I merge the branch back to master for the package.
- Whenever I work for a client on a package, I make a pull from master for that package to incorporate all the latest changes.
- Any change to a package which is client specific is a very bad idea and will result in me kicking myself and moving this code into a client package, utilizing any of the methods for customization (Parameters, Callbacks).
- There is an update-code.sh script which upgrades to the latest version for the client (e.g. if I released code from my dev system and want to update the demo system). This is actually a link from the admin pages.
- There is an additional update-submodules.sh script which updates my dev system to the latest version of the submodules.
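As a rough illustration of the pkgs-list.txt idea, here is a Python sketch that turns such a list into `git submodule add` command lines. The file format is an assumption made for this example (one package per line: name, repository URL, optional branch; lines starting with `#` are comments) and is not necessarily the format the real checkout-modules script reads. The function name and URLs are hypothetical as well.

```python
def submodule_commands(pkgs_list_text, packages_dir="packages"):
    """Build `git submodule add` argument lists from a package list.

    Assumed line format (hypothetical): <name> <repo-url> [<branch>]
    """
    commands = []
    for raw in pkgs_list_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) not in (2, 3):
            raise ValueError(f"cannot parse line: {raw!r}")
        name, url = fields[0], fields[1]
        cmd = ["git", "submodule", "add"]
        if len(fields) == 3:
            cmd += ["-b", fields[2]]  # track a specific branch
        cmd += [url, f"{packages_dir}/{name}"]
        commands.append(cmd)
    return commands
```

Because a submodule records an exact commit in the main repository, a per-client branch of the main repository can pin each package to a different commit, as described in the list above.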

This workflow pretty much covers what Gustaf mentioned under the first 5 bullet points.

In addition to that, I maintain one installation where I have scripts that pull the remote code from the ]project-open[ CVS repository clone into my package repositories, something I dread once per quarter (or so). This basically updates my (heavily changed) code base with the latest changes done in ]project-open[. This last part is mostly of interest if you keep your own cloned repositories of the OpenACS repositories and don't have the ability to commit all your code back (for whatever reason).

If you have questions on the workflow, let me know. The update-code / update-submodules would be the equivalent to "update from repository" which the APM packages try to do.

Malte, do you have the switching and refetching code for the following bullet point implemented in the package manager?

- users who want to adapt the code: it would be nice to switch from a "tar checkout" to a "git checkout" via a click in the package manager and to contribute code back also via package manager.

Do you use just branches, or tags as well? If one pulls via branches, one might get work in progress rather than "releases".

Is there any reason beyond historic ones that you use git submodules and not subtree?

Nope, instead I have an "update-code.sh" which is called from a new link in the APM to update to the latest code of the master package. This in turn updates all submodules to the corresponding versions.

In theory though adding the link to check out the master for each submodule should be fairly easy, yet it isn't really the switch from tar checkout to git checkout that is mentioned above. It would require a git submodule to be in place.

I only use branches to be honest as I import the tags from CVS (the code which moves CVS code to git adds branches and TAGS) and I don't need additional tags for my workflow, especially if you work with submodules and point to actual commits for each package.

As for subtree, I actually started with subtree (and still have a version with subtree in place). Yet I ran into issues with custom packages containing client-specific code, and with keeping the main repository, which contains all the subtrees, in sync with the various branches for development, testing and production. Ultimately I gave up and switched to submodules, never looking back. But I do have code which transforms ]project-open[ and OpenACS into ONE repository using subtree as well (if that is of interest).

Posted by Andrew Piskorski on
Currently the git structure is composed of 305 separate git repositories
Why? Why not one single Git tree with everything underneath it? Is there something about Git that forces you to use 305 separate repositories?

Posted by Malte Sussdorff on
Two reasons:
- ability to download only parts of the application modules and not all. After all you would not download all Wordpress plugins as part of one GIT repository either.
- apart from oacs-core that's how CVS is structured.

What you suggest would be the GIT subtree approach, and maybe a mix and match is the correct answer (i.e. GIT subtree for "Distributions" like OpenACS Core, dotLRN and the various project open installations, and GIT submodules for anything installed in addition to the distribution).

Downside: a distribution can only be updated and branched as a whole, disallowing the independent package by package modifications or package based branches.

Posted by Gustaf Neumann on

Downside: a distribution can only be updated and branched as a whole, disallowing the independent package by package modifications or package based branches.
Not necessarily. With subtree, one has the option to pull/push to the whole tree, and/or to the components via "subtree pull/push".
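For illustration, the per-component "subtree pull/push" can be sketched in Python as below. The helper name, the `packages/<name>` prefix layout and the repository URL in the usage example are assumptions; the sketch only assembles the `git subtree` command line and does not run it.

```python
def subtree_command(action, package, repo_url, ref, packages_dir="packages"):
    """Build a `git subtree pull` or `git subtree push` argument list
    for a single package directory inside the big repository."""
    if action not in ("pull", "push"):
        raise ValueError(f"unsupported subtree action: {action!r}")
    return [
        "git", "subtree", action,
        f"--prefix={packages_dir}/{package}",  # directory of the component
        repo_url,  # per-package repository (hypothetical URL in examples)
        ref,       # branch to pull from or push to
    ]
```

For example, `subtree_command("pull", "forums", "https://example.org/forums.git", "master")` yields the command that updates only `packages/forums`, while an ordinary `git pull` or `git push` still operates on the whole tree.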

A "large" single repository with the 300+ packages certainly also has advantages: it is quite simple setup-wise, has no problems with cross-package commits, and does not conflict with site-specific workflows. But it requires that all these packages are in the same branch, have the same tags, etc. This is not compatible with e.g. the per-package (actually per-file) tagging of CVS, where different packages can be in different branches, and where within the branch, tags (like openacs-5-8-compat) can be used for flagging the state of a package, such as releases or being part of different distributions (acs-core, dotlrn, etc.).