Forum OpenACS Q&A: Howto: PHRASE and Word Parse

Posted by MaineBob OConnor on

For my simple search routine, I am currently parsing words in Tcl like this:

set word_list [split [string tolower $find_txt] " "]
set n_words [llength $word_list]
set i 1
foreach this_word $word_list {
    set word$i $this_word
    incr i
}

I would like to be able to parse phrases AND words, where a phrase is marked by surrounding it with "quotes", so that given the string:

    The Quick "Brown Fox" Jumps

I get FOUR words or phrases:

  • The
  • Quick
  • Brown Fox
  • Jumps

I am looking for some simple Tcl code to do this, perhaps with some elegant regsub. A BONUS would be handling 2 phrases in the same string (4 quote marks), along with handling (ignoring) of 1, 3 or 5 quote marks.

THANKS in Advance -Bob

Posted by Jonathan Ellis on
this is what I use:

# escape all \, $, [, ] and ; characters with a backslash
# then use eval list instead of split to preserve quoted terms
regsub -all {[\\\$\[\];]} $1 {\\&} search_string
set search_words [eval list $search_string]

This does NOT handle an odd number of " marks at ALL; the eval will error out. But it does group the evenly quoted terms as desired.
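
For the odd-quote case, here is a minimal sketch that avoids eval altogether and pulls the terms out with regexp instead. It is not from the thread: the proc name parse_search_terms is hypothetical, it assumes Tcl 8.3 or later (for regexp -all -inline), and it assumes that "ignoring" an unpaired quote means dropping the last quote mark so that the remaining ones pair up.

```tcl
# Hypothetical helper (not from the original posts): split a search string
# into lowercase words and double-quoted phrases.
proc parse_search_terms {text} {
    set text [string tolower $text]

    # Assumption: with an odd number of quote marks, the last one is
    # unpaired, so replace it with a space and let the rest pair up.
    set n_quotes [regexp -all {"} $text]
    if {$n_quotes % 2 == 1} {
        set last [string last "\"" $text]
        set text [string replace $text $last $last " "]
    }

    set terms [list]
    # Each match is either a quoted phrase (first group) or a run of
    # characters that are neither whitespace nor quotes (second group).
    foreach {match phrase word} [regexp -all -inline {"([^"]*)"|([^\s"]+)} $text] {
        set phrase [string trim $phrase]
        if {[string length $phrase] > 0} {
            lappend terms $phrase
        } elseif {[string length $word] > 0} {
            lappend terms $word
        }
    }
    return $terms
}

# Usage: each element of the returned list is one word or one phrase.
set search_words [parse_search_terms {The Quick "Brown Fox" Jumps}]
set n_words [llength $search_words]
```

Because the input is never evaluated, characters such as $, [ and ; need no escaping here. Empty phrases (a bare "") are skipped rather than returned as empty terms.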