Forum OpenACS Development: cr_folder_type_map bloat

Posted by Gustaf Neumann on
This posting is a warning about bloat in the cr_folder_type_map table on large sites (e.g. those based on DotLRN).

The file-storage package has the bad property of registering essentially all object types under cr_revisions for every folder created. On our production site, it currently registers 60+ content types in cr_folder_type_map for every newly created folder, although just three content types are necessary (content_folder, content_extlink, and file_storage_object). In practice, this means that we have more than 40 million entries in our cr_folder_type_map, most of which come from the file-storage package.

production=# select count(*) from cr_folder_type_map ;
  count
----------
 40150804
(1 row)

On another large site I checked, I found more than 90 million (!) entries, most of which are useless.

I've added an upgrade script [1] which fixes this bloat. Note, however, that on large sites it will run for a very long time. Site admins might therefore choose to upgrade via the package manager (APM), skip the upgrade script, and do the cleanup in smaller steps. Unfortunately, there is no quick and easy way to select all folders of the file-storage, due to sins of the past, so I had to go over the tree_sortkey.

For smaller sites, the performance is fine (I just ran this on openacs.org in a fraction of a second). Large sites should probably create a temporary table with all file-storage folders (part of the upgrade script) and delete the superfluous entries in chunks based on this table. Unfortunately, fs_folders() returns not only folders from the file-storage. Maybe someone finds a way to speed this up.
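
To make the chunked approach more concrete, here is a minimal sketch. It is not the upgrade script itself. It uses Python with an in-memory SQLite database as a stand-in for PostgreSQL, and it assumes that cr_folder_type_map has the columns folder_id and content_type. The temporary table name fs_folder_ids, the function names, and the handful of extra content types in the demo are hypothetical. How the temporary table is filled with the file-storage folders is left out here, since that is the part the upgrade script solves via the tree_sortkey.

```python
import sqlite3

# The three content types the file-storage actually needs (see above).
NEEDED = ("content_folder", "content_extlink", "file_storage_object")


def delete_superfluous(conn, chunk_size=1000):
    """Delete unneeded cr_folder_type_map rows, one chunk of folders per commit.

    Assumes a temporary table fs_folder_ids(folder_id) (hypothetical name)
    already holds the ids of all file-storage folders.
    Returns the total number of deleted rows.
    """
    ids = [row[0] for row in conn.execute(
        "select folder_id from fs_folder_ids order by folder_id")]
    placeholders = ",".join("?" for _ in NEEDED)
    deleted = 0
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        # Restrict each delete to the id range of the current chunk, and only
        # to folders listed in the temporary table.
        cur = conn.execute(
            "delete from cr_folder_type_map "
            "where folder_id between ? and ? "
            "and folder_id in (select folder_id from fs_folder_ids) "
            f"and content_type not in ({placeholders})",
            (chunk[0], chunk[-1], *NEEDED))
        deleted += cur.rowcount
        conn.commit()  # keep transactions short
    return deleted


def demo():
    """Build a toy data set and run the chunked cleanup on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table cr_folder_type_map ("
        "folder_id integer, content_type text, "
        "primary key (folder_id, content_type))")
    conn.execute(
        "create temporary table fs_folder_ids (folder_id integer primary key)")
    # A few extra types standing in for the 60+ registered per folder.
    all_types = NEEDED + ("content_revision", "image", "news")
    # Folders 1..5 belong to the file-storage, folder 6 to another package.
    for folder_id in range(1, 7):
        conn.executemany(
            "insert into cr_folder_type_map values (?, ?)",
            [(folder_id, t) for t in all_types])
    conn.executemany(
        "insert into fs_folder_ids values (?)", [(i,) for i in range(1, 6)])
    conn.commit()
    deleted = delete_superfluous(conn, chunk_size=2)
    remaining = conn.execute(
        "select count(*) from cr_folder_type_map").fetchone()[0]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

Committing after every chunk keeps each transaction short, so the cleanup can be interrupted and resumed without redoing earlier chunks. Entries of folders that are not in the temporary table (folder 6 in the demo) are left untouched.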

best regards
-gn

[1] http://cvs.openacs.org/changelog/OpenACS?cs=oacs-5-8%3Agustafn%3A20141203162309