Forum OpenACS Q&A: Photo album for huge image library?

Posted by Keith Paskett on
I need to put together an image library that will hold tens of thousands of images. I've used Photo Album with OACS 4.6 but haven't yet been successful in getting the new photo album running.

I will need additional properties for each image -- classifications like people, hardware, buildings, etc.; if there are people, who they are; and so on. Then I need a good way to search based on those properties. Does anyone have experience with, or an opinion on, extending and scaling photo album for something like this?

Since the images aren't inherently organized by album, it seems like the way to go would be a core image library package, with the photo album package as an organization/presentation layer built on top of it.
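The layering described above can be sketched as follows. This is only a minimal illustration, assuming a plain relational schema in SQLite; the table names (`images`, `image_properties`, `album_images`) and the helper functions are hypothetical and are not taken from Photo Album or the content repository. Images and their properties live in the core tables, and albums are just a mapping on top.

```python
import sqlite3


def make_library():
    """Create an in-memory database with a hypothetical image-library schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Core layer: images and free-form name/value properties.
        CREATE TABLE images (
            image_id INTEGER PRIMARY KEY,
            filename TEXT NOT NULL
        );
        CREATE TABLE image_properties (
            image_id INTEGER NOT NULL REFERENCES images(image_id),
            name     TEXT NOT NULL,
            value    TEXT NOT NULL
        );
        CREATE INDEX image_properties_by_value
            ON image_properties (name, value);
        -- Presentation layer: albums only reference images.
        CREATE TABLE album_images (
            album    TEXT NOT NULL,
            image_id INTEGER NOT NULL REFERENCES images(image_id),
            PRIMARY KEY (album, image_id)
        );
    """)
    return db


def add_image(db, filename, properties, albums=()):
    """Store one image with (name, value) properties and optional albums."""
    image_id = db.execute(
        "INSERT INTO images (filename) VALUES (?)", (filename,)
    ).lastrowid
    db.executemany(
        "INSERT INTO image_properties (image_id, name, value) VALUES (?, ?, ?)",
        [(image_id, name, value) for name, value in properties],
    )
    db.executemany(
        "INSERT INTO album_images (album, image_id) VALUES (?, ?)",
        [(album, image_id) for album in albums],
    )
    return image_id


def find_images(db, criteria):
    """Return filenames of images matching every (name, value) pair."""
    sql = "SELECT filename FROM images i WHERE 1 = 1"
    params = []
    for name, value in criteria:
        sql += (
            " AND EXISTS (SELECT 1 FROM image_properties p"
            " WHERE p.image_id = i.image_id AND p.name = ? AND p.value = ?)"
        )
        params += [name, value]
    sql += " ORDER BY filename"
    return [row[0] for row in db.execute(sql, params)]
```

For example, `find_images(db, [("category", "people"), ("person", "Alice")])` would return only images tagged with both properties, whether or not they belong to any album.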

I will share whatever I come up with, but I tend to keep use of the content repository and the object system to a bare minimum so I can more easily move a package and its contents from one site to another.

Posted by Talli Somekh on
Keith, why aren't you using the CR again? The problems you are facing - categorization, metadata archiving, search facilitation, etc. - are the ones that the CR is best at solving.

If you choose to build around them then you will most likely end up building an ad hoc solution that approximates what's available in the CR.

talli

Posted by russ m on
Keith - having built and supported a digital asset management package for stock photo libraries that *didn't* use the CR, let me give you one piece of advice.

Use the CR.

If you're using the OACS object model at all then your objects won't be portable from installation to installation without being exported/imported anyway, and the amount of stuff you'll spend time re-implementing will blow your mind.

Posted by Keith Paskett on
I'm just moving from ACS 4.2 where my experience has been that after a few thousand objects the permission system becomes unacceptably slow for some very important (to me) types of queries. I have an ACS 4.2 site with about 800 groups and 300 users. It takes 30 seconds to get a list of groups that a user has permissions on. I do see the wisdom and logic in OpenACS's design. It is a brilliant piece of work and getting better (daily it seems). However, I have not always found it (in ACS 4.2 at least) to be practical to implement packages as originally intended, so I use it in unconventional ways and get the job done quite well.

So, back to my original question with a little clarification and extension. Does anyone have experience scaling Photo Album (or any package based on the CR I guess) to tens of thousands of objects? What if any impact does the number of objects in the CR have on the permissions system? Is it possible to have objects that do not inherit permission and thus grow the object-party privilege map view?
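To make the concern above concrete, here is a rough sketch of how a flattened object-party privilege map can grow with the number of objects. It is a toy model under stated assumptions, not a description of how ACS or OpenACS actually stores permissions: the `Obj` class, its `parent_id` and `inherits` fields, and the `flatten` function are all hypothetical. Each object sees the grants made on itself plus those on its ancestors, unless it opts out of inheritance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    object_id: int
    parent_id: Optional[int] = None  # hypothetical parent ("context") object
    inherits: bool = True            # hypothetical inherit-permissions flag


def flatten(objects, grants):
    """Expand direct grants into every (object, party, privilege) row
    implied by inheritance. `grants` is a list of
    (object_id, party, privilege) tuples made directly on objects."""
    by_id = {o.object_id: o for o in objects}
    rows = set()
    for obj in objects:
        current = obj
        while current is not None:
            # Grants on `current` apply to `obj` itself or via inheritance.
            for granted_on, party, privilege in grants:
                if granted_on == current.object_id:
                    rows.add((obj.object_id, party, privilege))
            # Stop climbing when inheritance is switched off or at the root.
            if not current.inherits or current.parent_id is None:
                break
            current = by_id[current.parent_id]
    return rows
```

In this model, two grants on one folder holding 10,000 inheriting images expand to 20,002 rows, while images with `inherits=False` add rows only for grants made directly on them.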

Posted by Talli Somekh on
Ah, ok, that makes sense why you're wary of the problem.

Those problems have been worked out in OACS4.5+. AFAIK, you needn't worry about the permissioning system bogging down due to too many objects.

In addition, the work we're doing on integrating WebDAV with OpenACS heavily leverages the CR. It sounds like you might want to take advantage of that functionality.

As far as your particular questions regarding scalability of the photo album, I'm not the one that can answer you...

talli

Posted by Dave Bauer on
The way permissions storage and queries are handled is completely different since OpenACS 4.6.1. It is not comparable to ACS 4.2 at all. Take a look at this: https://openacs.org/projects/openacs/proposals/scalability/permissions

You could save months of work implementing your image repository using the OpenACS content repository. Greenpeace International uses a content-repository based system to manage the image library on their web site, although I am not sure how many images are in there.

Besides that, I think you might find some other parties interested in building such a system for OpenACS.

Posted by Tom Jackson on

I think it entirely depends on exactly what you want to do. You may be able to get what you want quickly without the CR. Since no one but you knows what you want to do, answers will just be vague conjecture. Since I haven't seen any sophisticated examples using the CR, I assume it is either difficult to use or it doesn't satisfy the needs of developers. Also, the CR doesn't do much to satisfy your desire to create a portable export of the data. I would think this would be the first task, and then figure out how to translate it into the database and back out again.

Posted by Don Baccus on
Tom - define "sophisticated" then we can talk as to whether or not there are "sophisticated" uses of the CR.

Certainly there are content sites built using the CR that are more complex than any I've seen not using the CR ...

Many of our development community are happily using the CR so I find your comments somewhat amazing, to be honest.

Posted by Don Baccus on
If Keith wants his work to end up as part of our standard toolkit, or even as a respectable package within our contrib library, he's going to have to write it using standard tools and paradigms. I'm extremely tired of folks reinventing the wheel over and over again to do common tasks, and as I have said over and over and over again we need to *reduce* the number of ways common tasks are handled in the toolkit rather than add to them.

Posted by Tom Jackson on

Keith hasn't even defined the problem completely and everyone chants 'CR'. This is total 'BS'. A standard paradigm is to extend the acs_objects table; more packages follow that paradigm than use the CR. If you don't want revisions hanging around for every object you create, or you don't want to do a few extra joins to get your data back out again, why do you have to do so?

Keith's stated concern is portability. Instead of chanting 'CR' at him, why not step up and show how portability can be achieved, and what the CR can do to enhance his project?

If we want standard tools, they should apply to acs_objects, not items in the CR. This is a general tool, this is a standard paradigm. Then every time we add a tool it applies to everything, because IMNSHO everything in the database is content.

But before something can be called a standard paradigm, it has to be shown it can easily solve problems that are common to the area. So far the examples I have seen, and the amount of work that appears to be required to create them, lead me to believe that using the CR is difficult.

Recreating the functionality in your own package will also be time consuming, so if you need most of the features of the CR you might decide it is worth figuring out how to use it.

Posted by Talli Somekh on
So Tom... does this mean you've never tried to build on the CR?

And Keith's stated problems, as I read them, were portability *and* needing to have metadata, searchability, indexing, etc. - all things the CR does very well.

The fact that you haven't seen an application being built on the CR doesn't mean it doesn't exist. It just means either those projects aren't open or you haven't built them.

Also, no one I have spoken to said the CR is difficult. Just that the docs suck.

talli

Posted by Dave Bauer on
Tom,

You are correct that right now, the existing packages in OpenACS are not the best examples of how to use the content repository. Of course, most of them work, so there isn't as much incentive to rewrite them just for fun.

It is also a good idea to offer general features for acs_objects if that makes sense. One area this has been addressed for 5.0 is categorization, which was previously implemented only for content repository objects.

The greatest issue with the content repository is the lack of a full Tcl API to reduce the huge amount of PL/SQL code that needs to be written. The old aD CMS package, which is greatly unappreciated, has some good ideas on how to automate many of the things you are talking about. The user interface is far from ideal so it doesn't get used much. It is more of a content repository browser than a good example of an end-user application.

Lately the outward appearance of OpenACS has been improving. I think we can also improve the Tcl APIs for developers to allow for easier development.

Posted by Janine Ohmer on
Talli, I've obviously been remiss in leaving you out of my CR griping. :)

Seriously, every time I've had to work on code that uses the CR I've come to dislike it even more. I agree that the functionality it provides is desirable, but this particular implementation definitely does not fall into the sweet spot between functionality and usability as far as I am concerned (YMMV). Better documentation of what's there would definitely help, but it's not going to eliminate the inconsistent patchwork of helper procs, the need for extra joins even for things that don't need to be versioned, the nightmare (IMHO) of debugging code that is partly auto-generated, etc. etc.

To me, the entire package feels like it was overly ambitious at the start, was incomplete when aD dropped it, and then has been hacked on periodically to provide the functionality that various people needed.

If I was writing a new package, the only reason I would use the CR would be if I cared about my package being adopted by OpenACS.  If I didn't mind it languishing in the dungeon of contrib, I would avoid the CR like the plague.  Maybe I'd even take a stab at writing CR2, which might actually end up being usable.

Just my $0.02.  I know *very* well that not everyone agrees with me. ;)

Posted by Jun Yamog on
Hi Keith,

I guess you will have to decide for yourself which one to use. Here is what I can share about the minuses and pluses of the CR.

- The documentation is not that good. But today, unlike 2 years ago, there are already some apps to study to see how they use the CR. You can start with file storage, bug tracker (I think) and CMS. I have an ongoing and unfinished package in contrib (bcms*).

- If you will be developing something with CR-like features, then you might as well help us improve it. Or just use the CR in some other way; maybe some weaknesses or bugs will come out.

- Portability, I believe, is not good. There are exports and imports but I think they do not run. I have never used them. You are encouraged to contribute a fix for this weakness of the CR, though. This should be fairly easy if the structure of your content is file-system-like (e.g. folders and files).

- Search is also poor. But it's more of an OACS problem than a CR one. With the CR it is a little easier since some of the service contracts are already implemented. So it's like poor but better... weird :)

- Permission scalability. It's not really the CR but more the general permission system. So an improvement of permissions should directly improve CR permission scalability.

- At the start it's a bit hard to learn the CR, but once you are used to it things are a little better. Pretty much like OACS itself: at the start it is a bit difficult, but once you get used to it, it's OK.

I hope this helps you in your decision making.

All,

The above is just my opinion; it may not be 100% fact. I would like to avoid a useless discussion about "I like the CR" versus "I don't like the CR".

Posted by Keith Paskett on
I suspect that the lack of good documentation on the CR is more the problem than it being difficult. I've found that to be the case with most things I use in ACS.

When I started learning Unix many years ago, I thought every Unix expert had the opinion that "I learned it the hard way and so will you". Now I think it was/is more an "I know how to do that so why should I take time writing it down" attitude. Unfortunately that is the way I tend to work too often.

So what's the fastest way for me to figure out how to use the CR and what the specific benefits will be? Are the docs up-to-date, or do I search for threads in the forums?

Posted by Tom Jackson on

With all these happy campers, you would think a few would have released their ideas to the OpenACS community so we could learn from them. Until someone does, why should we be so insistent that Keith follow all the rules and release his code? Are we trying to help Keith, or ourselves here? Just repeating 'CR' doesn't do much to help anyone. Anyway, instead of just saying I am wrong, just point to working examples somewhere that prove I'm totally mistaken. It wouldn't be the first time. Please include with the examples a time estimate: how long did it take to write the package? The fact that it can be done, or that this or that feature is supported, is completely meaningless without knowing how much work went into writing the application, how fragile the code is, and how easy it is to modify or understand.

Maybe if a tutorial covered re-implementing the notes example in the CR and showing all the benefits, how it saves time, etc., that would be a better way of letting someone decide if the CR is good for their application.

Btw, just because something is difficult to use or figure out doesn't mean you shouldn't use it. And once you know it the decision to use it or not for future work becomes easier to make. The biggest benefit of the CR is that it is integrated with OpenACS.

Posted by Rocael Hernández Rizzardini on
In my opinion the CR is quite a good tool, and having a central service to handle all the content the way the CR does is just good. As Talli stated, the only missing thing is the docs, which are not so bad at all, but if you want to exploit everything the CR gives you, you'll need to investigate (which BTW is what you usually do in a fast-improving project like OACS). Actually, just seeing how other packages interact with the CR is in most cases enough to build your application, and it won't take you a lot of time to understand if you are used to working with OACS. About portability, anyone can create their own schema for it or try to apply *some* standard, so your work can be used by others ...

About scalability, it would be good to do some scalability tests before you use it in production. Keith, are you using PG or Oracle? (I guess Oracle.)
Anyway, why don't you create some simple scripts to populate the DB (one photo album instance)? It won't be hard and will give you a good look at performance in extreme conditions.
Check this:
http://xarg.net/writing/tuning/forums-scale
or
http://viaro.net/tips.htm#scalability
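
The populate-and-measure idea above can be sketched roughly as follows. This is only an illustration of the shape of such a script: it uses an in-memory SQLite table named `photos` as a stand-in, which is an assumption and not the Photo Album data model, and a real test would load data into the actual PostgreSQL or Oracle instance through the package itself.

```python
import random
import sqlite3
import time


def populate(db, n_images, seed=0):
    """Fill a hypothetical `photos` table with n_images synthetic rows."""
    rng = random.Random(seed)  # fixed seed keeps runs repeatable
    categories = ["people", "hardware", "buildings"]
    db.execute(
        "CREATE TABLE photos ("
        "photo_id INTEGER PRIMARY KEY, title TEXT, category TEXT)"
    )
    db.executemany(
        "INSERT INTO photos (title, category) VALUES (?, ?)",
        ((f"photo-{i}", rng.choice(categories)) for i in range(n_images)),
    )
    db.commit()


def time_query(db, sql, params=(), repeats=5):
    """Run a query several times; return (best seconds, row count)."""
    best = float("inf")
    count = 0
    for _ in range(repeats):
        start = time.perf_counter()
        rows = db.execute(sql, params).fetchall()
        best = min(best, time.perf_counter() - start)
        count = len(rows)
    return best, count


if __name__ == "__main__":
    for size in (1_000, 10_000, 100_000):
        db = sqlite3.connect(":memory:")
        populate(db, size)
        elapsed, count = time_query(
            db, "SELECT photo_id FROM photos WHERE category = ?", ("people",)
        )
        print(f"{size} photos: {count} matches in {elapsed:.4f}s")
```

Running the same queries at several sizes shows whether response time grows roughly linearly or falls off a cliff, which is the question for a library of tens of thousands of images.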

Posted by Talli Somekh on
Janine, I also know how you work and it tends to be on the conservative side, which is understandable.

But those gripes are also no good within the context of making the system better, particularly when the choice is to avoid the system completely rather than improve it. The CR can be a very strong aspect of the OpenACS architecture but gets far too much pushback and neglect from those who choose to write ad hoc implementations over and over and over...

talli

Posted by Rocael Hernández Rizzardini on
I completely agree with Talli. If we want a stronger architecture, it will be better to take some time to improve something that, from someone's perspective, maybe is not too good, instead of doing a semi-implementation of the same thing again. Anyway, it always depends on your time frame, but if you are betting on the future of your tool, reserve some time for improving other related tools, in this case the CR.

Posted by Carl Robert Blesius on
Keith, institutions are starting to trickle into the OpenACS world via .LRN, and storing LOTS of objects with metadata, categorizing, IMPORT and EXPORT are important for all of them. I agree with Tom in that the biggest benefit of the CR is its integration with OpenACS. Any shortcomings of the CR will be addressed eventually. I am interested to hear what you decide on this. Lots of people using it will help identify problems more quickly and result in more timely solutions (it also seems the CR has enough fans that you will probably get help if you ask :).

Posted by Bart Teeuwisse on
Be careful which packages you study to learn how to use the CR. The current photo package is a good example of how NOT to use the CR.

For example, it abuses the live_revision link between cr_item and cr_revision to hide or publish photos. (And this method is broken to boot!).

Bad examples like this make the lack of good documentation even worse. My guess is that Janine's struggles are in part due to this compounded problem. The photo package isn't the only package that uses the CR incorrectly.

/Bart

Posted by Randy O'Meara on
This is yet another CM thread that, when responses are weighed, tends to lean toward the negative. There are others hidden about in these forums.

I've watched as folks waded into the oacs CM stream over the last two years. Most, it appears, have been scared away by claims of "DANGEROUS WATERS". I'm just wondering why there is so little effort given to producing concrete examples and guidance that would prove the warnings false. Well, actually, I know why this is the case... the guys (and gals) that have taken the time to dig in and understand (the ones that are qualified to dispel the fears) are very much in demand for these exact skills.

I remember not too long ago when Joel revamped/rewrote the Getting Started tutorial that helped some of the basic oacs pieces fall into place for me. And, I'm sure it has helped countless others as well.

Wouldn't it be wonderful if, as Tom suggested above, there were a simple tutorial (including working code samples) that would help beginning CR users understand the basics? The pitfalls? I don't think it would be extremely time consuming for any individual CR expert if each of those knowledgeable about the CR added a small piece to the tutorial.

We do have a CMS forum that's gathering dust. How about a thread in that forum where bits and pieces could be added? If you have coded with the CR, would you post?

If so, just add a message to the thread (https://openacs.org/forums/message-view?message_id=136094) I just created for this purpose!

Now is your opportunity to give a little and receive a lot.

Posted by Malte Sussdorff on
Wouldn't it be a good idea, for starters, to have a page listing all packages currently using the CR, giving a recommendation of good vs. bad code, and describing how and why it makes sense to use the CR in each package?

Furthermore, could the supporters write a "Why and when to use the CR" (e.g. by posting in the thread mentioned above), whereas the critics raise their points and provide suggestions on how to improve it.

The CR is a core issue that will haunt us for quite some time if not tackled correctly. We should be able to give clear guidelines on when to use the CR, and the OCT should come to an agreement at some point on whether or not to make the use of the CR mandatory for inclusion in the main tree.

Furthermore, there are a lot of ideas already for extending / enhancing acs_objects. I just refer to the acs_named_objects discussion. My question would be, taking into account that there are quite a lot of packages storing content in acs_objects or their own tables, shall we put more functionality to acs_objects to make it available to these packages as well?

A lot of discussion has already happened and I'm pretty sure the old OCT has some ideas on this issue. I'd love to see the new OCT come up with a clear guideline on this and means to encourage people to follow it (and I'm talking carrots here). Randy's effort goes in the right direction, thanks.