Alex,
Doing an exec in a process with a large footprint can be expensive (doubling the footprint in the worst case, although that depends on the operating system in use). The footprint of nsd is the sum of the memory of all nsd threads. To reduce the footprint, I would recommend reducing the number of connection threads and using bg-delivery for large file deliveries:
http://www.openacs.org/xowiki/en/Boost_you_application_performance_to_serve_large_files%21
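As a rough sketch of what reducing the connection threads could look like in the server's config.tcl: the section name below assumes an AOLserver 4.0 style configuration (newer versions configure threads per connection pool instead), and the numbers are purely illustrative, not tuned for your site.

```tcl
# Assumption: AOLserver 4.0-style section; ${server} is the server name
# set earlier in config.tcl. The values are examples only.
ns_section ns/server/${server}
ns_param   minthreads 5    ;# connection threads kept alive
ns_param   maxthreads 5    ;# upper limit of connection threads
```

Since every connection thread carries its own Tcl interpreter, fewer threads means a smaller nsd footprint and therefore a cheaper exec.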
Furthermore, I would recommend switching to a recent version of Tcl (such as 8.4.13). Several memory- and thread-related problems were fixed in Tcl during the last four years.
Another option to address the fork problem is to use ns_proxy, which should be better in this respect.
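A minimal sketch of how an exec could be moved into a proxy process, assuming the nsproxy module is loaded (it ships with AOLserver 4.5). The pool name "exec_pool" and the command being run are made up for illustration; please check the ns_proxy documentation of your version for the exact options.

```tcl
# Assumption: nsproxy is loaded; "exec_pool" is a hypothetical pool name.
set handle [ns_proxy get exec_pool]

# Run the external command in the small proxy process rather than in nsd.
# The handle is released in both the error and the success case.
if {[catch {ns_proxy eval $handle {exec ls /tmp}} result]} {
    ns_proxy release $handle
    error $result
}
ns_proxy release $handle
ns_log notice "proxy result: $result"
```

The exec then happens from the proxy process, which is much smaller than nsd, so the cost no longer depends on the footprint of nsd.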