I've set a few web crawlers/search engines loose on my company's intranet and was quite disappointed with both ht://dig and UDMsearch - both are VERY slow at lookups, IF you can even get the crawl to complete.
I ended up putting together my own crawler, with Pavuk doing the crawling and Swish++ doing the indexing and searching. I've thrown a lot at Swish++ and it's the most impressive indexing/search engine I've ever tried (speed, stability, overhead) - and it's GPL too. See http://www.best.com/~pjl/software/swish/