Forum OpenACS Development: Re: Possible issue in file storage

Collapse
Posted by Gustaf Neumann on
The situation is more complicated as it seems at first sight.

a) The behavior should be the same for uploading files via zip files, or a single files. Therefore, one should sanitize always or never.

b) I had a hard time to find, where the "view" method is used. On OpenACS.org, all file-links are "download" links.

c) you say, that "download" works everywhere, but "view" has problems. Why are the files accessed differently in "view" and "download"?

d) urlencoding: the RFC 3986 defines url-encoding differently for the "path" and the "query" part of the URL. Aolserver ignores the difference, and performed everywhere (incorrectly) the "query" variant. In OpenACS+naviserver, the urls (and therefore the path_info) are decoded correctly, but not, if someone encodes the path-part of the URL with the query-part encoding. Therefore, the place, where the "view" link is generated, should use the "path" encoding.

How do you get the "view" link? is this a custom code?

Collapse
Posted by Antonio Pisano on
The tcl file responsible for most of the file-storage gui is file-storage/www/folder-chunk.tcl

In there you will see that the view method is used whenever the parameter 'like_filesystem_p' is set to false in the system.

The view url really executes a vuh file in file-storage/view/. The problem for this page is that it retrieves files by their 'name' field, thus leading to our quoting hell.

'download' url calls a vuh script in file-storage/downlod/ folder. This page retrieves files by their id, a much safer approach.

In the folder-chunk tcl, when building the 'view' link, name field is passed through the ad_urlencode proc. Maybe the inconsistency in quotation happens here, as this proc replaces spaces with '+', instead than '%20', and "ad_conn path_info" is not able to decode them properly. One of the two procs could be wrong.

Of course, one could just set the parameter to true and never deal with the view page again...

Collapse
Posted by Gustaf Neumann on
Antonio,

Thank you for these pointers. Normally, i try to avoid file-storage. I've added a function ad_urlencode_path to handle the differences between aolserver and naviserver and added these to the folder_chunk in file-storage (btw., the encoding was not done consistently there either, some branches used the encoding for $title, some not).

This version is no supposed to work with both servers for viewing sanitized and non-sanitized names.

I've not touched the inconsistency of the names for uploading single files vs. uploading via zipped files. It seems to me, as if the original intention was to permit arbitrary names in the name attribute of the content repository, but to convert these when downloading via fs::get_file_system_safe_object_name, but folder-chunk does it the other way round, sanitize the file_upload_name with fs::remove_special_file_system_characters, which is a helper of fs::get_file_system_safe_object_name. This would mean that the zip-behavior is as intended.

i would not wonder if one could get as well different names when downloading single files vs. downloading via zip.

Collapse
Posted by Antonio Pisano on
Thanks for taking care of it, your fix looks legit to me.

Not sure what could be the safest approach about this for a through revision... Letting people have special chars in filenames is fancy and we have the procs to make it work quite easily, but then comes "that special case" and we go mad again...