Forum OpenACS Q&A: Response to grabbing text from a website...

Posted by MaineBob OConnor on

Hi David,

In an OpenACS 3.x install at:

http://www.ercmembers.net/doc/proc-one.tcl?proc_name=util%5fstriphtml

there is this proc:
util_striphtml html

What it does:

Returns a best-guess plain text version of an HTML fragment. Better than ns_striphtml because it doesn't replace &gt; and &lt; with the empty string.
Defined in: /web/nsaerc/tcl/ad-utilities.tcl.preload

Source code:
  return [util_expand_entities [util_remove_html_tags $html]]
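If you don't have the 3.x utilities handy, the same two-step idea (strip the tags first, then expand the entities) can be roughed out in plain Tcl. This is only a sketch: the `sketch_*` procs are hypothetical stand-ins I made up for illustration, not the real `util_remove_html_tags` or `util_expand_entities`, and the real ones most likely handle more cases (comments, more entities, and so on).

```tcl
# Hypothetical stand-in for util_remove_html_tags (assumption, not the real source).
proc sketch_remove_html_tags {html} {
    # Drop anything that looks like a tag: "<", any run of non-">" characters, ">".
    regsub -all {<[^>]*>} $html {} text
    return $text
}

# Hypothetical stand-in for util_expand_entities (assumption, not the real source).
proc sketch_expand_entities {text} {
    # Only a handful of common entities are covered here.
    # string map makes a single pass over the input, so "&amp;lt;"
    # comes out as the literal text "&lt;" rather than "<".
    set map [list "&lt;" "<" "&gt;" ">" "&quot;" "\"" "&nbsp;" " " "&amp;" "&"]
    return [string map $map $text]
}

# Same shape as the one-liner above: remove tags first, then expand entities.
proc sketch_striphtml {html} {
    return [sketch_expand_entities [sketch_remove_html_tags $html]]
}

puts [sketch_striphtml {<p>Tom &amp; Jerry: 1 &lt; 2</p>}]
# prints: Tom & Jerry: 1 < 2
```

Because the tags are removed before the entities are expanded, an escaped `&lt;b&gt;` in the text survives as a literal `<b>` instead of being stripped.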

Also, there are other procs you might find useful, listed at one of my sites:

http://www.ercmembers.net/doc/proc-search.tcl?query_string=html

The equivalent /doc/... link here on openacs.org itself has been broken for a long time 😟

http://www.openacs.org/doc/proc-search.tcl?query_string=html

-Bob