xowiki::datasource (private)
xowiki::datasource [ -nocleanup ] revision_id
Defined in packages/xowiki/tcl/xowiki-sc-procs.tcl
- Switches:
  - -nocleanup (optional, boolean)
- Parameters:
  - revision_id (required)
- Returns: a datasource for the search package
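A hedged sketch of how a caller (such as a search indexer) might invoke the proc and consume its result; the revision_id value 42 is a made-up example, and this assumes a running OpenACS/NaviServer Tcl context where the proc is loaded:

```tcl
# Hypothetical invocation; 42 stands in for a real cr_revisions.revision_id.
set ds [::xowiki::datasource 42]

# The result is a flat key/value list; convert it to a dict for convenient access.
set d [dict create {*}$ds]
ns_log notice "indexing [dict get $d object_id]:\
    title '[dict get $d title]' mime [dict get $d mime]"
```

Passing `-nocleanup` would additionally skip the `::xo::at_cleanup` call at the end, which may matter when the caller manages XOTcl object cleanup itself.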
- Partial Call Graph (max 5 caller/called nodes):
- Testcases:
- No testcase defined.
Source code:

    #ns_log notice "--sc ::xowiki::datasource called with revision_id = $revision_id"
    set page [::xowiki::Package instantiate_page_from_id -revision_id $revision_id -user_id 0]
    #ns_log notice "--sc ::xowiki::datasource $page [$page set publish_status]"
    if {[$page set publish_status] in {production expired}} {
        #
        # No data source result for pages under construction
        #
        #ns_log notice "--sc page under construction, no datasource"
        return [list object_id $revision_id title "" content "" \
                    keywords "" storage_type text mime text/html]
    }
    #ns_log notice "--sc setting absolute links for page = $page [$page set name]"
    set d [dict merge {mime text/html text "" html "" keywords ""} [$page search_render]]
    if {![dict exists $d title]} {
        dict set d title [$page title]
    }
    switch [dict get $d mime] {
        text/html {
            set content [dict get $d html]
            if {![string is space $content]} {
                try {
                    dom parse -simple -html <html>$content doc
                    $doc documentElement root
                    foreach n [$root selectNodes {//script|//noscript|//style|//nav|//button}] {
                        $n delete
                    }
                    set content [$root asHTML]
                } on error {errorMsg} {
                    ns_log notice "xowiki::datasource: could not parse result of search_render for page $page: $errorMsg"
                }
            }
            #
            # The function ad_html_text_convert can take forever on largish
            # files, e.g. when someone loads huge content into xowiki.
            # So, when available, and performance is an issue, one could
            # consider using "ns_striphtml", but this produces no nice
            # rendering, so text-based syndication will suffer. For now,
            # "ns_striphtml" is deactivated.
            #
            if {0 && [info commands ns_striphtml] ne ""} {
                set text [ns_striphtml [dict get $d html]]
            } else {
                set text [ad_html_text_convert -from text/html -to text/plain -- [dict get $d html]]
                #set text [ad_text_to_html -- [dict get $d html]] ;# could be used for entity-encoded HTML text in RSS entries
            }
            #
            # If the HTML contains links (which are rendered by ad_html_text_convert
            # as [1], [2], ...), then we have to use CDATA in the description.
            #
            if {[string first {[1]} $text] > -1} {
                append description {<![CDATA[} \n $content { ]]>}
            } else {
                set description [ns_quotehtml $text]
            }
        }
        text/plain {
            set content [dict get $d text]
            set description $content
        }
        default {
            ns_log error "can't handle results of search_render of type '[dict get $d mime]'"
            set content ""
            set description ""
        }
    }
    #ns_log notice "--sc INDEXING $revision_id -> $text keywords [dict get $d keywords]"
    #
    # Clean up old revisions. This might run into an error when search is not
    # configured and therefore the table "txt" does not exist. TODO: we should
    # look for a better solution, where syndication does not depend on search.
    #
    $page instvar item_id
    if {[::xo::db::require exists_table txt]} {
        ::xo::dc dml delete_old_revisions {
            delete from txt where object_id in
                (select revision_id from cr_revisions
                 where item_id = :item_id and revision_id != :revision_id)
        }
    }
    set pubDate [::xo::db::tcl_date [$page set publish_date] tz]
    set link [$page detail_link]
    set result [list object_id $revision_id \
                    title [dict get $d title] \
                    content $content \
                    keywords [dict get $d keywords] \
                    storage_type text \
                    mime [dict get $d mime] \
                    syndication [list \
                                     link [string map [list & "&amp;"] $link] \
                                     description $description \
                                     author [$page set creator] \
                                     category "" \
                                     guid "$item_id" \
                                     pubDate $pubDate]]
    if {!$nocleanup_p && [catch {::xo::at_cleanup} errorMsg]} {
        ns_log notice "cleanup in ::xowiki::datasource returned $errorMsg"
    }
    return $result

XQL Not present: Generic, PostgreSQL, Oracle
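The `syndication` entry of the returned list is itself a flat key/value list. The following sketch shows how an RSS renderer might pick it apart; the rendering code is illustrative only (it is not the search package's actual implementation), and the revision_id 42 is hypothetical:

```tcl
# Hypothetical revision_id; -nocleanup skips the ::xo::at_cleanup call.
set ds [::xowiki::datasource -nocleanup 42]
set d   [dict create {*}$ds]
set syn [dict create {*}[dict get $d syndication]]

# Build an RSS <item> fragment from the syndication fields.
# Note: "description" may already contain a CDATA section or
# entity-quoted text, so it is emitted as-is here.
append item "<item>\n" \
    "  <title>[ns_quotehtml [dict get $d title]]</title>\n" \
    "  <link>[dict get $syn link]</link>\n" \
    "  <guid>[dict get $syn guid]</guid>\n" \
    "  <pubDate>[dict get $syn pubDate]</pubDate>\n" \
    "  <description>[dict get $syn description]</description>\n" \
    "</item>"
```

Note that `link` is already entity-escaped (`&` mapped to `&amp;`) by the proc itself, so it must not be quoted a second time.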