Forum OpenACS Q&A: another regexp problem...

Posted by David Kuczek on
I want to use regexp to extract the weather data values for "Cityname", "Zip Code" and
"Degree Celcius" from the following HTML code...

I just couldn't get it to work. (I thought it would be easier...)

<table cellpadding="2" cellspacing="0" border="0" width="100%"
class="register">

<tr>

  <td colspan="3" class="idWether">Cityname:</td>

</tr>

<tr>

  <td colspan="3" class="idWetherField">$cityname</td>

</tr>

<tr>

  <td colspan="3" class="idWether">Zip Code:</td>

</tr>

<tr>

  <td colspan="3" class="idWetherField">$zip_code</td>

</tr>

<tr>

  <td colspan="3" class="idWether">Degree Celcius:</td>

</tr>

<tr>

  <td colspan="3" class="idWetherField">$degree_celcius</td>

</tr>

</table>

Many thanks

Posted by Michael A. Cleverly on
Here's a regular expression that should work. You can shorten every occurrence of [[:space:]] to \s (a backslash followed by an s). (I've noticed that backslashes in posts sometimes disappear, or have to be quoted to get past Postgres, and a missing one would definitely mess things up.)

set RE {<tr>[[:space:]]*<td colspan="3" class="idWether">Cityname:</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWetherField">(.*?)</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWether">Zip Code:</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWetherField">(.*?)</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWether">Degree Celcius:</td>[[:space:]]*</tr>[[:space:]]*<tr>[[:space:]]*<td colspan="3" class="idWetherField">(.*?)</td>[[:space:]]*</tr>[[:space:]]*}
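For reference, the shortened form could be sketched as below. The names field_re and RE_short are hypothetical and not part of the original answer; the resulting pattern text is meant to be the RE above with every [[:space:]] written as \s, built from one repeated fragment so the three fields are easier to see.

```tcl
# Hypothetical helper (not from the original post): returns the pattern for
# one label row followed by its value row, using \s instead of [[:space:]].
proc field_re {label} {
    set ws {\s*}   ;# braces keep the backslash intact
    set re "<tr>${ws}<td colspan=\"3\" class=\"idWether\">${label}:</td>${ws}</tr>${ws}"
    append re "<tr>${ws}<td colspan=\"3\" class=\"idWetherField\">(.*?)</td>${ws}</tr>${ws}"
    return $re
}

# The three fields, in page order, concatenated into one pattern.
set RE_short [field_re Cityname][field_re {Zip Code}][field_re {Degree Celcius}]
```

Setting the pattern inside braces (or building it from a braced fragment, as here) avoids having to double the backslash for Tcl itself.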

Then you could process all the rows by doing:

foreach {match city zipcode celcius} [regexp -inline -all -nocase -- $RE $html] {
     # code goes here ...
}
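This is Tcl 8 specific (it relies on non-greedy matching and on the -inline and -all switches to regexp), so it won't work under nsd76, the AOLserver binary built against Tcl 7.6.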
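
A minimal end-to-end sketch of the loop above, run against a made-up page containing two weather tables. Everything here is an assumption for illustration: the sample_table and row helpers, the city data, and the variable names RE_strict and rows. It also swaps the (.*?) captures for ([^<]*), which assumes the values contain no markup. The reason for the swap is that, as the re_syntax manual page describes it, a Tcl regular expression takes its overall longest/shortest preference from the first quantifier in it, so a non-greedy capture after a greedy [[:space:]]* is worth testing carefully when several tables sit on one page.

```tcl
# Hypothetical helpers that rebuild the table layout from the question.
proc row {class text} {
    return "<tr>\n\n  <td colspan=\"3\" class=\"${class}\">${text}</td>\n\n</tr>\n\n"
}
proc sample_table {city zip deg} {
    set t "<table class=\"register\">\n\n"
    foreach {label value} [list Cityname $city {Zip Code} $zip {Degree Celcius} $deg] {
        append t [row idWether "${label}:"] [row idWetherField $value]
    }
    return "${t}</table>\n"
}

# Made-up sample data: two tables on one page.
set html "[sample_table Hamburg 20095 14][sample_table Munich 80331 19]"

# Build the pattern: one label row plus one value row per field.
set ws  {[[:space:]]*}
set cap {([^<]*)}   ;# assumption: values never contain a '<'
set RE_strict ""
foreach label {Cityname {Zip Code} {Degree Celcius}} {
    append RE_strict "<tr>${ws}<td colspan=\"3\" class=\"idWether\">${label}:</td>${ws}</tr>${ws}"
    append RE_strict "<tr>${ws}<td colspan=\"3\" class=\"idWetherField\">${cap}</td>${ws}</tr>${ws}"
}

# -inline -all returns a flat list: match, city, zip, degrees, match, city, ...
set rows {}
foreach {match city zipcode celcius} [regexp -inline -all -nocase -- $RE_strict $html] {
    lappend rows [list $city $zipcode $celcius]
}
puts $rows   ;# prints: {Hamburg 20095 14} {Munich 80331 19}
```

Each pass through the foreach consumes four list elements, so the number of variable names has to stay equal to the number of capture groups plus one for the full match.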