Forum OpenACS Q&A: Response to another regexp problem...

Posted by Michael A. Cleverly on
Here's a regular expression that should work. You can shorten every [[:space:]] to a backslash followed by an s. (I've noticed that backslashes in posts sometimes disappear, or have to be quoted to get past Postgres, and a missing backslash would definitely mess things up.)
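To make the shorthand concrete, here is a minimal sketch showing that the two spellings match the same text. The one-row sample and the variable names are made up for illustration; the pattern is a cut-down version of the one in this post, not the full expression.

```tcl
# Two spellings of the same pattern. Braces keep Tcl from
# consuming the backslash before regexp sees it.
set long  {<tr>[[:space:]]*<td>(.*?)</td>[[:space:]]*</tr>}
set short {<tr>\s*<td>(.*?)</td>\s*</tr>}

# Hypothetical input: a single table row with whitespace between tags.
set sample "<tr>\n  <td>Berlin</td>\n</tr>"

# "->" is just a throwaway variable name for the full match.
regexp -- $long  $sample -> cityLong
regexp -- $short $sample -> cityShort

puts [list $cityLong $cityShort]
```

If you paste the short form into a double-quoted string instead of braces, Tcl will eat the backslash and the pattern will look for a literal s.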

set RE {<tr>[[:space:]]*<td colspan="3" class="idWether">Cityname:</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWetherField">(.*?)</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWether">Zip Code:</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWetherField">(.*?)</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWether">Degree Celcius:</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWetherField">(.*?)</td>[[:space:]]*</tr>[[:space:]]*}

Then you could process all the rows by doing:

foreach {match city zipcode celcius} [regexp -inline -all -nocase -- $RE $html] {
     # code goes here ...
}

This is specific to Tcl 8; it won't work under nsd76, the AOLserver binary built against Tcl 7.6.
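
To see how the foreach consumes the flat list that regexp -inline -all returns, here is a small self-contained sketch. The sample HTML, the class names, the two-field pattern, and the rows variable are all invented for illustration. The fields are captured with [^<]* rather than (.*?) only so that the result does not depend on how greedy and non-greedy quantifiers interact within one expression.

```tcl
# Hypothetical two-record sample; the real page has more markup per record.
set html {
<tr><td class="name">Berlin</td></tr>
<tr><td class="zip">10115</td></tr>
<tr><td class="name">Hamburg</td></tr>
<tr><td class="zip">20095</td></tr>
}

# One capture group per field; \s* absorbs the newlines between rows.
set RE {<tr>\s*<td class="name">([^<]*)</td>\s*</tr>\s*<tr>\s*<td class="zip">([^<]*)</td>\s*</tr>}

# -inline -all returns a flat list: full match, group 1, group 2,
# then the same again for the next match. With two capture groups,
# foreach therefore takes the elements three at a time.
set rows {}
foreach {match city zipcode} [regexp -inline -all -nocase -- $RE $html] {
    lappend rows [list $city $zipcode]
}

puts $rows
```

The number of loop variables has to be one more than the number of capture groups, which is why the loop in the post uses four names for its three groups.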