Forum OpenACS Development: Issues importing data into XoWiki

Posted by Frank Bergmann on
Hi All,

I'm trying to import a number of HTML pages into our ]po[ Documentation XoWiki:

AOLserver: 4.5.1
OpenACS: 5.4.0
XoTCL-Core: 0.106
XoWiki: 0.118

I'm using the script from the XoWiki manual to import HTML pages:

## A small script to load the OpenACS documentation into xowiki
## by Gustaf Neumann
## load doc pages into xowiki -gustaf neumann

# include path for loading tdom
lappend auto_path /usr/local/aolserver4/lib
package require tdom
package require XOTcl; namespace import ::xotcl::*
package require xotcl::serializer

namespace eval ::xowiki {
  Class create Page -parameter {
    {lang en} {description ""} {text ""}
    {nls_language en_US} {mime_type text/html} name title text
  }
  set docpath /export/projop/web/data-model/Functions
  set c 0
  foreach docpage [glob "$docpath/*.html"] {
    set f [open $docpage r]; set data [read $f]; close $f
    dom parse -html $data doc
    $doc documentElement root
    set content ""
    foreach n [$root selectNodes //body/*] { append content [$n asHTML] \n }
    set p [Page create page[incr c] \
      -name en:[file tail $docpage] \
      -title [[$root selectNodes //title] asText] \
      -text [list $content text/html]]
    puts [$p serialize]
  }
}
doc_return 200 text/html OK

Here is what I get:

- The original script gave me an error. I had to move the "set docpath ..." into the "namespace eval" block.

- The first time I run it, I get the "OK", but nothing happens. The "page1" isn't there, and the XoWiki Admin page shows the same number of pages.

- The second time I run the script, I get an error "::xowiki::require_folder: unable to dispatch method 'package_id' during '::xowiki::require_folder package_id'" when looking at the index page. The error is gone after restarting the server.

Btw., the Wiki isn't finished yet, but it starts to contain reasonable information on the structure of ]po[, packages, object types, categories etc. Have a look at: http://www.project-open.org/documentation/. Contents are licensed as Creative Commons.

Cheers!
Frank

Posted by Gustaf Neumann on
Dear Frank,

I'm not sure what you expect from the script. The sample script, which you used as a source and modified, is most likely from http://alice.wu-wien.ac.at:8000/xowiki-doc/#import-export-from-other-sources

This script is written to be run outside of OpenACS, from the shell. Within OpenACS, there would be no need to load tdom, the serializer, etc. Note that the script defines the class ::xowiki::Page; if you do this from within OpenACS, the xowiki pages on the connection thread will stop working. The docpath was a kind of safety measure to avoid running the script from within OpenACS.

Anyhow, if you want to load HTML files into xowiki, adjust the paths, run the script, save the output to a file, and load this file via "import" in xowiki/admin. In most cases, you will want to clean up the pages to remove some elements, adjust links, etc. I would recommend doing the cleanup via tdom as well (in the foreach loop). This script is just a skeleton to start with.
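
A minimal sketch of what such a cleanup step could look like, to be called from the foreach loop instead of collecting the body nodes directly. The helper name clean_body and its link_prefix argument are hypothetical, and which elements to drop and how relative links should be rewritten are assumptions that depend on your pages. Only standard tdom node methods (selectNodes, delete, removeAttribute, getAttribute, setAttribute, asHTML) are used:

```tcl
package require tdom

# clean_body is a hypothetical helper, not part of xowiki or tdom.
# html:        the raw content of one HTML file
# link_prefix: string prepended to relative links (assumption; the
#              right value depends on where the pages end up)
proc clean_body {html {link_prefix ""}} {
    set doc  [dom parse -html $html]
    set root [$doc documentElement]

    # Remove elements that should not end up in a wiki page.
    foreach n [$root selectNodes {//script | //style}] {
        $n delete
    }

    # Remove inline styling so the wiki's own style sheet applies.
    foreach n [$root selectNodes {//*[@style]}] {
        $n removeAttribute style
    }

    # Adjust relative links; leave anchors, absolute paths and
    # links with a URL scheme (http:, mailto:, ...) untouched.
    foreach a [$root selectNodes {//a[@href]}] {
        set href [$a getAttribute href]
        if {![regexp {^(#|/|[a-zA-Z][a-zA-Z0-9+.-]*:)} $href]} {
            $a setAttribute href $link_prefix$href
        }
    }

    # Collect the children of <body>, as the sample script does.
    set content ""
    foreach n [$root selectNodes {//body/*}] {
        append content [$n asHTML] \n
    }
    $doc delete
    return $content
}
```

In the loop, "set content [clean_body $data]" would then take the place of the inner foreach that builds $content. The title still has to be read from the parsed document as before.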

hope this helps,
-gustaf neumann

Posted by Frank Bergmann on
Hi Gustaf,

Thanks a lot for your reply!

Cheers!
Frank