OpenCyc can be considered an enhancement to the
existing search package, for most uses as a preprocessor.
One major challenge in full-text searching is query
contextualization. Basically, when someone enters a general
search request for a topic like "Java," the search engine doesn't
know if they're interested in the programming language, coffee, or
an Indonesian island. OpenCyc (and WordNet, which is a part of
the package) can help with that contextualization, in part by a)
engaging the requestor in a dialog to ask which sense they
are interested in, and/or b) chunking the returned results into these
parent categories.
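
To make a) and b) concrete, here is a minimal Python sketch. It does not call OpenCyc or WordNet at all: the SENSES table and its cue words are made up by hand as a stand-in for what the knowledge base would supply, and the function names are hypothetical.

```python
import re

# Hypothetical, hand-written sense inventory. A real system would get
# the senses and their related terms from OpenCyc or WordNet instead.
SENSES = {
    "java": {
        "programming language": {"code", "compiler", "class", "jvm"},
        "coffee": {"coffee", "brew", "roast", "espresso"},
        "Indonesian island": {"indonesia", "island", "jakarta", "volcano"},
    }
}

def senses_for(term):
    """a) The candidate senses to offer the requestor in a dialog."""
    return sorted(SENSES.get(term.lower(), {}))

def chunk_results(term, results):
    """b) Group result texts under the sense whose cue words they share most."""
    cues = SENSES.get(term.lower(), {})
    groups = {}
    for text in results:
        words = set(re.findall(r"[a-z]+", text.lower()))
        best = max(cues, key=lambda s: len(cues[s] & words), default=None)
        # No cue word matched at all: keep the result, but unclassified.
        if best is None or not cues[best] & words:
            best = "uncategorized"
        groups.setdefault(best, []).append(text)
    return groups

if __name__ == "__main__":
    hits = [
        "Java compiler and JVM tuning",
        "How to brew espresso from dark roast java",
        "Volcano hiking on Java island, Indonesia",
    ]
    print(senses_for("Java"))
    print(chunk_results("java", hits))
```

Counting shared cue words is a crude substitute for real inference, but it shows the shape of the thing: one list of senses to ask about, and one bucket of results per sense.
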
One way to use query contextualization and chunking is for the
preprocessor (OpenCyc) to formulate queries that will return a
separate results set for each sense of a word or phrase, a form
of metasearch, if you will. The results are aggregated and
presented to the requestor using some presentation method
that supports i) serendipity and/or ii) helping the requestor
eliminate from consideration results which are obviously not of
interest. General terms return so many results (java returns 2+
million pages at Google, while java+coffee returns 190k pages)
that enabling users to say, "I am not interested in these things," is
incredibly useful.
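
A rough sketch of that metasearch step, in Python. Everything here is an assumption for illustration: `search` is whatever callable wraps your existing search package, the sense-to-cue-word mapping is written by hand rather than produced by OpenCyc, and `build_queries` and `metasearch` are hypothetical names.

```python
def build_queries(term, sense_cues):
    """One query string per sense: the term plus that sense's cue word."""
    return {sense: f"{term} {cue}" for sense, cue in sense_cues.items()}

def metasearch(term, sense_cues, search, excluded=()):
    """Run one query per sense, skipping senses the requestor ruled out.

    `search` is any callable taking a query string and returning a list
    of results; it stands in for the real search engine.
    """
    results = {}
    for sense, query in build_queries(term, sense_cues).items():
        if sense in excluded:
            continue  # "I am not interested in these things"
        results[sense] = search(query)
    return results

if __name__ == "__main__":
    # Hand-written cue words; a real preprocessor would derive these.
    cues = {
        "programming language": "programming",
        "coffee": "coffee",
        "Indonesian island": "indonesia",
    }
    fake_search = lambda q: [f"result for '{q}'"]  # placeholder engine
    print(metasearch("java", cues, fake_search, excluded={"coffee"}))
```

Keeping the result sets separate per sense is what lets the presentation layer show them side by side (serendipity) or drop a whole sense at once (elimination).
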
Because OpenCyc "knows" about these aspects of language, it
can examine texts to determine what they are "about." In this way
OpenCyc can be used to extract meaning from a document/db
record (e-mail, web page, forum posting, etc.) and automatically
generate metadata that can be inserted in a db in a form that is
optimized for querying or query assistance. In one use I have,
OpenCyc would scan the contents of e-mails directed to
customer support and route them based on content.
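
The e-mail routing case could look roughly like the Python below. To be clear about the assumptions: the topic table is invented, plain keyword counting stands in for whatever OpenCyc would actually infer about the text, and `extract_metadata` and `route` are hypothetical names.

```python
import re

# Hypothetical topic vocabulary; in the scenario above this knowledge
# would come from the knowledge base, not from a hand-made table.
TOPIC_TERMS = {
    "billing": {"invoice", "refund", "charge", "payment"},
    "technical": {"error", "crash", "install", "login"},
}

def extract_metadata(text):
    """Return {topic: hit count} for every topic mentioned in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    metadata = {}
    for topic, terms in TOPIC_TERMS.items():
        hits = sum(1 for w in words if w in terms)
        if hits:
            metadata[topic] = hits
    return metadata

def route(text, default="general"):
    """Pick the queue for a message: the topic with the most hits."""
    metadata = extract_metadata(text)
    if not metadata:
        return default
    return max(metadata, key=metadata.get)

if __name__ == "__main__":
    message = "I need a refund, the payment went through twice."
    print(extract_metadata(message))
    print(route(message))
```

The dictionary returned by `extract_metadata` is the kind of thing that could be stored in a db column next to the record and indexed for later querying.
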
OpenCyc can be used for general knowledge discovery in an
e-doc collection, and it can work retrospectively as well as on
current material. That is, I can run OpenCyc against an e-doc
collection and let it generate metadata about the collection, which then gets
indexed. I can imagine having a historical reporting function that
included trending to let me know the change in frequency of
occurrence of certain topics. That is important in a CRM context,
among other things.
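
The trending report I have in mind would be something like this Python sketch. It assumes the metadata has already been generated and stored as (period, topics) pairs; that record shape, the sample data, and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict

def topic_counts_by_period(records):
    """records: iterable of (period, topics), e.g. ("2003-01", ["billing"])."""
    counts = defaultdict(Counter)
    for period, topics in records:
        counts[period].update(topics)
    return counts

def trend(records, topic):
    """Return [(period, count, change from previous period)] in period order."""
    counts = topic_counts_by_period(records)
    rows, previous = [], None
    for period in sorted(counts):
        n = counts[period][topic]  # Counter gives 0 for a missing topic
        change = None if previous is None else n - previous
        rows.append((period, n, change))
        previous = n
    return rows

if __name__ == "__main__":
    # Invented sample metadata, one entry per document.
    sample = [
        ("2003-01", ["billing"]),
        ("2003-01", ["billing", "login"]),
        ("2003-02", ["login"]),
        ("2003-03", ["billing"]),
    ]
    for row in trend(sample, "billing"):
        print(row)
```

Period labels are sorted as strings, so this only orders correctly with sortable labels such as year-month.
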