Forum OpenACS Q&A: Re: BBoard Vs Forums, i18n Vs SWS

Posted by Malte Sussdorff on
If you talk about lexical analysis of the text, what do you have in mind to use?

What would be good is to ask users to categorize their content (let's say 20 items). From this point on an analysis tool could analyze the existing content and the new content, look at the categories and add it automatically to the appropriate categories in the category system. Or at least set the default to a certain value.
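As a rough sketch of the idea above, here is one way the "learn from about 20 hand-categorized items, then suggest a default category" step could look. It uses a small naive Bayes classifier, which is an assumption on my part since the post does not name a technique. The class name, the method names and the sample texts are all hypothetical, and nothing here is part of the OpenACS categorization package.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Hypothetical multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of items
        self.vocabulary = set()                  # every word seen in training

    def train(self, text, category):
        """Record one item that a user has categorized by hand."""
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocabulary.update(words)

    def suggest(self, text):
        """Return the most probable category, or None if nothing was trained."""
        if not self.doc_counts:
            return None
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no information, so skip them.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Made-up training items standing in for the user-categorized content.
categorizer = NaiveBayesCategorizer()
categorizer.train("How do I install PostgreSQL for OpenACS?", "installation")
categorizer.train("Installation fails while loading the data model", "installation")
categorizer.train("Translating message keys into German", "i18n")
categorizer.train("Locale and message catalog questions", "i18n")

# The suggestion could be used as the pre-selected default in a category widget.
default_category = categorizer.suggest("Problems loading the PostgreSQL data model")
```

With only a handful of training items the suggestions will be rough, which is why using the result as a default that the user can override may be safer than filing content automatically.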

Concerning the browsing widget, the issue here is that you will have to touch all the queries in the packages to support an "and category_id ...." clause. So having a widget which lets you limit the items displayed in each package to only those which match a certain (and/or linked) categorization is not the big issue. It is the support within the packages that is giving the headaches.
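To illustrate why each package query needs touching, here is a minimal sketch of a listing query that takes an optional category filter. It uses Python's sqlite3 module with an in-memory database purely for demonstration. The table names, column names and the `list_messages` function are illustrative assumptions, not the actual schema or API of any OpenACS package.

```python
import sqlite3

# Hypothetical stand-in schema: one package table plus an object-to-category map.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_messages (message_id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE category_object_map (category_id INTEGER, object_id INTEGER);
    INSERT INTO forum_messages VALUES
        (1, 'Install trouble'), (2, 'German locale'), (3, 'Upgrade notes');
    INSERT INTO category_object_map VALUES (10, 1), (20, 2), (10, 3);
""")


def list_messages(conn, category_id=None):
    """Return message subjects, optionally limited to one category."""
    sql = "SELECT m.subject FROM forum_messages m"
    params = []
    if category_id is not None:
        # This is the extra clause every package query would have to support.
        sql += (" WHERE EXISTS (SELECT 1 FROM category_object_map c"
                " WHERE c.object_id = m.message_id AND c.category_id = ?)")
        params.append(category_id)
    sql += " ORDER BY m.message_id"
    return [row[0] for row in conn.execute(sql, params)]


all_subjects = list_messages(conn)
install_subjects = list_messages(conn, category_id=10)
```

The widget itself only has to supply `category_id`; the work lies in giving every listing query in every package a hook like the optional clause above.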

If, OTOH, you mean that you want to browse the content by category and package, this is already in the categorization package. You are limited, though, to the title of the object and (soon?) to the description, which each package should be able to fill in.

Posted by Andrew Piskorski on
Malte, I have never used any lexical/text categorizing tools myself. Several I've heard of are the code discussed in Paul Graham's A Plan for Spam, CRM114, and bogofilter. And Graham lists many other open source Bayesian filters.

Most of those seem to have been used so far primarily or only as spam filters, but there have definitely been other applications (extracting the interesting posts from Usenet, for example).

Someone or other here also wrote a college thesis doing automatic classification of text in an OpenACS system, but I never read it and now I don't remember who that was or where it is.

There was also some old discussion of OpenCyc, which is of course quite different.

Posted by Håkan Ståby on
Rafael Calvo wrote in this thread about automatic categorization and he also wrote a paper:

"Williams K., R. A. Calvo and D. Bell. Automatic Categorization of Questions for a Mathematics Education Service. Artificial Intelligence in Education Conference. Sydney, Australia. July 2003"

It can be found here: http://www.weg.ee.usyd.edu.au/people/rafa/papers/

/Håkan