I had to work HARD, caching-wise, to get nsd's share of the CPU usage up to where it's now maybe 20-30% of the total. That's with gzip compression on. I'd guess that unless you're serving much more image-intensive pages, you don't really need to buy more boxes for the nsd end. But since you don't say what kind of CPU usage you're seeing currently, that's just a wild guess. :)
My datapoint: carnageblender serves about 250k pages on a good day. I've seen it do 300k without breaking a sweat on my dual 1GHz P3 box with 1GB RAM. I would bet the current setup could handle at least twice what it does now. Past that, I could get a few more percent from caching less-important stuff, but to get past 1M pages a day I'd need better disks. I've already split the WAL onto its own disk, which helps, but let's face it, 5400 RPM IDE disks aren't where it's at. :)
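For a rough sense of scale, here is a back-of-the-envelope sketch in Python that turns the daily page counts above into average pages per second. The function name and the `peak_factor` parameter are made up for illustration and nothing here was measured on the box; real traffic is bursty, so peak rates will sit well above these averages.

```python
# Back-of-the-envelope conversion from pages/day to average pages/second.
# The daily figures come from the post; everything else is an assumption.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def pages_per_second(pages_per_day, peak_factor=1.0):
    """Average page rate; peak_factor is a hypothetical burst multiplier."""
    return pages_per_day * peak_factor / SECONDS_PER_DAY


daily_loads = [
    ("good day", 250_000),
    ("best seen", 300_000),
    ("twice current", 500_000),
    ("disk-bound target", 1_000_000),
]

for label, pages in daily_loads:
    print(f"{label:>18}: {pages_per_second(pages):5.1f} pages/sec on average")
```

So 250k pages a day averages out to only about 2.9 pages per second, and even 1M a day is under 12 per second on average. Pass something like `peak_factor=5` (a guess, not a measurement) if you want to see what a busy hour might look like.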
My feeling is that if you can optimize to the point where you can stay on a single (possibly bigger) box, it will be worth it to avoid the admin headache of multiple points of failure.