Carl,
a significant part of the problem is ad_urlencode, which does the encoding. The same applies to all URLs generated by OpenACS, so the correct place to remove the spurious encoding is ad_urlencode or, even better, ns_urlencode. Btw, since ns_urlencode already encodes "_" as %5f (unneeded according to RFC 3986, see unreserved characters), ad_urlencode contains pre/postprocessing hacks. OpenACS already uses a mix of ns_urlencode and ad_urlencode (ns_urlencode is used about 3 times more often than ad_urlencode). I mailed about the unneeded encodings in ns_urlencode to the AOLserver mailing list in Dec 2005, but got no reply. In contrast to AOLserver, NaviServer appears to have an RFC 3986 compliant URL encoder/decoder, using a different interface.
Concerning 1: URL encoding uses + for spaces. This deviates from the standard. By doing so naively, one loses the ability to distinguish between "a b" and "a_b". Is there a specification of what MediaWiki does in detail?
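To make the ambiguity concrete, here is a minimal Python sketch (not OpenACS code). The helper naive_title_encode is hypothetical and only assumes that "naively" means replacing every space with an underscore and nothing else:

```python
from urllib.parse import quote, quote_plus

def naive_title_encode(title: str) -> str:
    """Hypothetical helper: map spaces to underscores, nothing else."""
    return title.replace(" ", "_")

# Both titles collapse to the same string, so the mapping cannot be reversed.
collision = naive_title_encode("a b") == naive_title_encode("a_b")

# Form-style encoding (+ for space) keeps the two titles apart ...
form_style = (quote_plus("a b"), quote_plus("a_b"))

# ... and so does plain percent-encoding (%20 for space, "_" left alone).
percent_style = (quote("a b", safe=""), quote("a_b", safe=""))
```

With this sketch, collision is true, while both form_style and percent_style yield two different strings per pair. Any underscore scheme would therefore need an explicit rule for titles that already contain "_".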
Concerning 2: note that RFC 1738 was replaced by RFC 3986, which says: for "unreserved characters", defined as
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
no percent-encoded octets should be created. URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource.
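The equivalence rule can be sketched as a small normalizer. This is an illustration in Python under my own assumptions, not what ns_urlencode or NaviServer actually do; the name normalize_unreserved is hypothetical:

```python
import re
import string

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"  (RFC 3986)
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def normalize_unreserved(uri: str) -> str:
    """Decode percent-encoded octets that stand for unreserved characters.

    All other percent-encoded octets are kept, with hex digits uppercased.
    """
    def repl(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", repl, uri)

# "/wiki/a%5fb" and "/wiki/a_b" normalize to the same string,
# while an encoded space stays encoded.
same = normalize_unreserved("/wiki/a%5fb") == normalize_unreserved("/wiki/a_b")
kept = normalize_unreserved("/wiki/a%20b")
```

Here same is true and kept is still "/wiki/a%20b", which is why an encoder that emits %5f for "_" produces a longer URL without identifying a different resource.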
Concerning 3: what is the connection between content negotiation and language selection in xowiki? What exactly is your proposal?
I am open to discussing every change proposal, but I am strictly against hacks.