Forum OpenACS Q&A: BBoard Vs Forums, i18n Vs SWS

Posted by Ciaran De Buitlear on
We're trying to decide between using the forums package or the bboard package for our latest project. The substantial issues are:
  1. The bboard package is integrated with the content repository and the (Oracle) site-wide search, while the forums package seems not to be.
  2. The forums package is internationalized and the bboard package is not.

We need both i18n and site-wide search (SWS) in whichever discussion forum package we choose, and we will make the necessary changes. Can anyone offer any guidance?
Posted by Peter Marklund on
I don't know too much about the bboard package. What's important to consider though is that all (or almost all) support and momentum is behind the forums package now. The forums package is used by openacs.org and by .LRN (MIT, Heidelberg etc.). Expect new feature development in this package. Also, openacs.org offers search of forums messages.
Posted by Ciaran De Buitlear on
It seems that this search covers forums only and is not a "site wide search". Given Peter's comment, would the best course of action be for us to change the forums package to avail of the content repository, thereby making it compatible with the "site wide search"?
Posted by Peter Marklund on
Ciaran,
that sounds like a good course of action to me. If I'm not mistaken OpenACS core team members such as Dan Wickstrom and Don Baccus have argued all along that the forums package should have used the content repository.

I can't really give a good estimate on how much work is involved in the change. If you decide to do this, you might want to look at the bug-tracker package for reference as we upgraded it to using the CR recently. The relevant upgrade script is:

bug-tracker/sql/postgresql/upgrade-0.9d1-1.2d2.sql

Don't be deterred by the size of that upgrade script, as it did a lot more than just the CR upgrade 😊

Posted by Rocael Hernández Rizzardini on
It will be great to have the forums package integrated with the CR! Also, I think it would be useful for packages that are no longer supported by the community to carry some kind of "warning" saying that the use of that specific package is "deprecated". The question is where to put this message.
Posted by Don Baccus on
I deprecated bboard in the 4.6.2 tarball the old fashioned way - I took it out!

It would be great to have the forums package use the content repository. Ben felt strongly that he did not want to use the content repository for forums and I didn't disagree with him, though I regret it now. The main reason I didn't make it an issue was that I was trying my best to avoid making our already poor working relationship worse, not because I agreed technically.

But I do bear part of the responsibility as I did agree with Ben's decision to not use the content repository.

I don't think rewriting it to use the CR would be that difficult, nor would writing upgrade scripts be too burdensome.  In 4.6.2 I ripped out the special "fs_simple_objects" type that Open Force put into file-storage  to avoid using the content repository for URLs, and replaced it with the already existing non-versionable "external link" type in the CR.  The upgrade script wasn't that hard to write ...

Ciaran ... if you folks want to undertake this task for 5.0, I would be willing to help with advice etc., but I won't have time to do any significant coding on it in the near future.

Posted by Ciaran De Buitlear on
We will almost certainly be doing this in the next few weeks. I'll keep you posted.  Thanks for your offer of advice etc.  - it will certainly be taken up...
...Our priority would be to get this working for new installs so we might not get time to do upgrade scripts.
Posted by Peter Alberer on
Just wanted to ask: what would be the advantage of using the content repository for the forums package? Integration into site-wide search could be achieved without the CR. Does anyone need revisions for forum posts?
Posted by Ciaran De Buitlear on
Nice one, Peter. My mistake. I should spell that out: having looked at this for the last day or so, I feel our priority is the site wide search. I don't personally think that the content-repository functionality as such is very useful for forum posts. Sorry about the confusing original post!
Posted by Christof Spitz on
Will there be categories in the Forums package?
Posted by Christof Spitz on
Will there be categories in the Forums package?
Posted by Peter Marklund on
Christof,
we just finished the first version of the Logger application and I asked Timo to help us out adding categorization to that package. It took Timo about 20 minutes to do so. I think the same amount of work would be required for forums, although I think the thing we might be missing is a really nice category browsing widget. When I come to think about it Timo probably developed one and we would just have to use a new template.
Posted by Ciaran De Buitlear on
As far as I know there are no categories in the forums package at the moment.  We won't be adding this functionality as such but we have plans to add an automatic "categorisation" of all content by using "oracle text" "themes" - these themes are visible in our extended site wide search.
Posted by Ciaran De Buitlear on
I wonder what percentage of bboard (or any) content is actually categorised by the poster? Are there any figures or anecdotal evidence available? We are working on a large KM project for a government-type body; our preferred solution for categorisation is, as mentioned previously, "Oracle text themes". You can acquire, create and/or extend a knowledge base (or taxonomy). You can then get the system to assign a number of themes to each piece of content based on lexical analysis of the content.
Posted by Malte Sussdorff on
If you talk about lexical analysis of the text, what do you have in mind to use?

What would be good is to ask users to categorize their content (let's say 20 items). From this point on an analysis tool could analyze the existing content and the new content, look at the categories and add it automatically to the appropriate categories in the category system. Or at least set the default to a certain value.
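One way to picture this "learn from a few hand-categorized items, then suggest a default" idea is a small naive Bayes classifier, the same family of technique used by Bayesian spam filters. The sketch below is only an illustration under stated assumptions: the class name, the category names and the sample texts are all hypothetical, and nothing here is part of OpenACS or its categorization package.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of items
        self.vocabulary = set()

    def train(self, text, category):
        """Record one item that a user has categorized by hand."""
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def suggest(self, text):
        """Return the most probable category for new, uncategorized text."""
        if not self.doc_counts:
            raise ValueError("no training data")
        # Words never seen during training carry no evidence, so skip them.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical hand-categorized items (the "let's say 20 items" step).
categorizer = NaiveBayesCategorizer()
categorizer.train("upgrade script fails on postgresql install", "installation")
categorizer.train("oracle install hangs while loading the data model", "installation")
categorizer.train("site wide search does not index forum posts", "search")
categorizer.train("search results miss new messages", "search")

# The suggestion could be used as the default value in a category widget.
suggestion = categorizer.suggest("search index misses forum messages")
```

A real deployment would need far more training data per category, and the suggestion would probably be best treated as a default the poster can override rather than a final assignment.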

Concerning the browsing widget, the issue here is that you will have to touch all the queries in the packages to support an "and category_id ...." structure. So having a widget which lets you limit the items displayed in each package to only those which match a certain (and/or linked) categorization is not the big issue. It is the support within the packages that is giving the headaches.
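To make the "touch all the queries" point concrete, here is a minimal sketch of one listing query gaining an optional category restriction. It uses an in-memory SQLite database purely for illustration; the table names `messages` and `category_map`, their columns, and the function name are assumptions made up for this example, not the actual forums or categorization data model.

```python
import sqlite3


def messages_in_category(conn, category_id=None):
    """List message subjects, optionally limited to a single category."""
    sql = "select m.message_id, m.subject from messages m"
    params = []
    if category_id is not None:
        # This is the extra "and category_id ..." style restriction that
        # every listing query in a package would need to learn about.
        sql += (
            " where exists (select 1 from category_map c"
            " where c.object_id = m.message_id and c.category_id = ?)"
        )
        params.append(category_id)
    sql += " order by m.message_id"
    return conn.execute(sql, params).fetchall()


# Hypothetical schema and data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table messages (message_id integer primary key, subject text);
    create table category_map (category_id integer, object_id integer);
    insert into messages values (1, 'Install problem'), (2, 'Search question');
    insert into category_map values (10, 2);
""")

all_rows = messages_in_category(conn)                     # no filter
search_rows = messages_in_category(conn, category_id=10)  # one category
```

The widget only has to supply `category_id`; the real work is that each package's queries must accept and apply it, which is the headache described above.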

If OTOH you mean, you want to browse the content by category and package, this is already in the categorization package. You are limited though to the title of the object and (soon?) to the description, which each package should be able to fill in.

Posted by Ciaran De Buitlear on
I'm sorry if I haven't made this clear - we're using "Oracle Text", an indexing and advanced searching tool formerly known as Intermedia.

Here's a simplified example of the kind of thing we did (I have more details if you want them):

create a "theme" table:
on qfs18 sqlplus enke_km/enke_km
create table mythemes
( query_id     number,
  theme     varchar2(2000),
  Weight     number
)
/

Create a Thesaurus index:
create index Thesaurus_idx on cr_revisions(title) indextype is ctxsys.context;

Generate themes from content (which bits of the taxonomy are relevant to some content):
/* repeat for every piece of content */
begin
  ctx_doc.themes( index_name => 'Thesaurus_idx',
                  textkey    => '714',
                  query_id   => 714,
                  restab     => 'mythemes'
  );
end;
/

Look at the results:
select theme, weight from mythemes order by weight desc;
select query_id from mythemes order by weight desc;

I think we could do lots with this to improve searching and browsing of content:  Build taxonomy into search (done), link "topics of interest" or "skills" into it (under review), browsing of taxonomy (part of Oracle 9i), extending / adding taxonomies etc. (part of 9i).

We are currently converting OACS to use Oracle 9i as there is more advanced functionality available.

Here are the Metalink entries about it:
http://metalink.oracle.com/metalink/plsql/ml2_gui.startup

Posted by Neophytos Demetriou on
Ciaran, have you had any thoughts about writing an "Oracle Text/Context/Intermedia" implementation (driver) for the search contract provided by the search package?

Currently, there's only an OpenFTS (PostgreSQL-based) implementation available.

Writing an implementation of the search engine contract for Intermedia would allow you (and others) to index/search content uniformly.

Posted by Christof Spitz on
Ciaran,
we opened bboard-forums for study groups, and each is studying a different main topic each semester. Categories help me organise announcements and subjects by subject matter and group without having to create too many forums. Too many forums is a disadvantage: in the end all study groups have to go through the same subjects sooner or later, so it is better to build a common knowledge base, and people are too lazy to browse a lot of forums.

So, although our forums are fairly new and we don't have much experience, I think we will make extended use of the categories. Since we use the forums for announcements and also for sharing study materials with the students, many of the postings are done by ourselves, so we have it in our hands to use categories.

Also, we are using Postgres instead of Oracle, so "Oracle text themes" may not be available.

Posted by Ciaran De Buitlear on
Hi,
  I'm not really trying to criticise "categories" - their application in the situation that Christof describes seems very appropriate, i.e. a small number of categories and a fairly "tame" audience of fairly computer-literate users who can be "encouraged" to comply with the system. I'm really explaining why we're not particularly interested. Our application is in the KM area and we'll have many, many categories, sub-categories, sub-sub-categories etc., fairly non-computer-literate users, and no real way to police the input of categories. Hence the alternative approach.

If we add categories to forums I take it we could "switch off" that feature.

Posted by Andrew Piskorski on
Malte, I have never used any lexical/text categorizing tools myself. Several I've heard of are the code discussed in Paul Graham's A Plan for Spam, CRM114, and bogofilter. And Graham lists many other open source Bayesian filters.

Most of those seem to have been used so far primarily or only as spam filters, but there have definitely been other applications (Extracting the interesting posts from Usenet, for example.)

Someone or other here also wrote a college thesis doing automatic classification of text in an OpenACS system, but I never read it and now I don't remember who that was or where it is.

There was also some old discussion of OpenCyc, which is of course quite different.

Posted by Talli Somekh on
Whoa, I'm going to start using CRM114 just for the name.

"Captain, I'm reading Wing Attack Plan R over the CRM114..."

talli "purity of essence" somekh

Posted by Håkan Ståby on
Rafael Calvo wrote in this thread about automatic categorization and he also wrote a paper:

"Williams K., R. A. Calvo and D. Bell. Automatic Categorization of Questions for a Mathematics Education Service. Artificial Intelligence in Education Conference. Sydney, Australia. July 2003"

It can be found here: http://www.weg.ee.usyd.edu.au/people/rafa/papers/

/Håkan

Posted by Ciaran De Buitlear on
This discussion has taken legs and run off and that's a very good thing...
...We've decided that the best thing for us and our client is to  internationalise the bboard package which is already compatible with the site wide search.
Posted by Christof Spitz on
Great!
Posted by Christof Spitz on
Great!
Posted by Christof Spitz on
Sorry, if my postings appear twice, it's not because I feel they have double importance 😉.

The thing is that sending the post is dead slow: for about 30 seconds or so nothing happens, so I thought the connection had been cut or something and posted again.

It only happens after the last confirmation of the message. Is it because this thread is already quite long (<22 postings)?

Posted by Don Baccus on
We haven't been able to track down why forums on openacs.org is so slow.  It's not true elsewhere on test data or live sites we're aware of.  There may be missing indexes or something like that ... no one's had time to track it down.  openacs.org is sort of a patchwork of older and newer snapshots of OpenACS 4.
Posted by Christof Spitz on
In any case, thanks to all at openacs! Most importantly, it works.
Posted by Robin Felix on

This responds to the question requesting figures or anecdotal evidence concerning user-categorized forum posts. The following bboard categorization figures come from the current posts in a Marketing Forum on a major defense company intranet driven by an OpenACS 3.2.5 installation.

This forum has 10 categories, three of which are assigned by an outside administrator to track a timed process; the rest are self-selected by the users. If no category is selected for a topic and it does not need "finite state machine" control, it remains uncategorized.

Total topics: 832

  • Uncategorized: 18%
  • Categorized by users: 62%
  • Categorized by administrator: 21%